PHP function to get the most recurrent words in a file - it uses Linux command line

August 3rd, 2010

/**
 * Extracts the most recurrent one-word and two-word terms in a file.
 * Filters out common English stop words; you can also pass extra ones.
 *
 * @param string $filepath - the path of the text file to analyse
 * @param int $minWordLength - the minimal word length for the terms to extract
 * @param int $numberOfTerms - the number of terms to retrieve
 * @param array $extraStopWords - extra stop words to filter out (optional)
 * @return array of string - the most recurrent terms
 */
function getMostRecurrentTermsInFile ($filepath, $minWordLength, $numberOfTerms, array $extraStopWords = array())
{
    $stopwords = array('a', 'about', 'above', 'across', 'after', 'afterwards', 'again', 'against', 'all',
        'almost', 'alone', 'along', 'already', 'also', 'although', 'always', 'am', 'among', 'amongst', 'amount',
        'an', 'and', 'another', 'any', 'anyhow', 'anyone', 'anything', 'anyway', 'anywhere', 'are', 'around', 'as', 'at',
        'back', 'be', 'became', 'because', 'become', 'becomes', 'becoming', 'been', 'before', 'beforehand', 'behind', 'being',
        'below', 'beside', 'besides', 'between', 'beyond', 'bill', 'both', 'bottom', 'but', 'by', 'call', 'can', 'cannot',
        'cant', 'co', 'con', 'could', 'couldnt', 'cry', 'de', 'describe', 'detail', 'do', 'done', 'down', 'due', 'during',
        'each', 'eg', 'eight', 'either', 'eleven', 'else', 'elsewhere', 'empty', 'enough', 'etc', 'even', 'ever', 'every',
        'everyone', 'everything', 'everywhere', 'except', 'few', 'fifteen', 'fifty', 'fill', 'find', 'fire', 'first', 'five',
        'for', 'former', 'formerly', 'forty', 'found', 'four', 'from', 'front', 'full', 'further', 'get', 'give', 'go', 'had',
        'has', 'hasnt', 'have', 'he', 'hence', 'her', 'here', 'hereafter', 'hereby', 'herein', 'hereupon', 'hers', 'herself',
        'him', 'himself', 'his', 'how', 'however', 'hundred', 'ie', 'if', 'in', 'inc', 'indeed', 'interest', 'into', 'is', 'it',
        'its', 'itself', 'keep', 'last', 'latter', 'latterly', 'least', 'less', 'ltd', 'made', 'many', 'may', 'me', 'meanwhile',
        'might', 'mill', 'mine', 'more', 'moreover', 'most', 'mostly', 'move', 'much', 'must', 'my', 'myself', 'name', 'namely',
        'neither', 'never', 'nevertheless', 'next', 'nine', 'no', 'nobody', 'none', 'noone', 'nor', 'not', 'nothing', 'now',
        'nowhere', 'of', 'off', 'often', 'on', 'once', 'one', 'only', 'onto', 'or', 'other', 'others', 'otherwise', 'our', 'ours',
        'ourselves', 'out', 'over', 'own', 'part', 'per', 'perhaps', 'please', 'put', 'rather', 're', 'same', 'see', 'seem', 'seemed',
        'seeming', 'seems', 'serious', 'several', 'she', 'should', 'show', 'side', 'since', 'sincere', 'six', 'sixty', 'so', 'some',
        'somehow', 'someone', 'something', 'sometime', 'sometimes', 'somewhere', 'still', 'such', 'system', 'take', 'ten', 'than',
        'that', 'the', 'their', 'them', 'themselves', 'then', 'thence', 'there', 'thereafter', 'thereby', 'therefore', 'therein',
        'thereupon', 'these', 'they', 'thick', 'thin', 'third', 'this', 'those', 'though', 'three', 'through', 'throughout', 'thru',
        'thus', 'to', 'together', 'too', 'top', 'toward', 'towards', 'twelve', 'twenty', 'two', 'un', 'under', 'until', 'up', 'upon',
        'us', 'very', 'via', 'was', 'we', 'well', 'were', 'what', 'whatever', 'when', 'whence', 'whenever', 'where', 'whereafter',
        'whereas', 'whereby', 'wherein', 'whereupon', 'wherever', 'whether', 'which', 'while', 'whither', 'who', 'whoever', 'whole',
        'whom', 'whose', 'why', 'will', 'with', 'within', 'without', 'would', 'yet', 'you', 'your', 'yours', 'yourself', 'yourselves'
    );

    $stopwords = array_merge($stopwords, $extraStopWords);

    // placing each word on a separate line
    // (escapeshellarg() keeps spaces and shell metacharacters in the path from breaking the command)
    $command = 'sed -e "s/[^a-zA-Z]/\n/g" ' . escapeshellarg($filepath);
    $command .= '|';
    // stripping out the empty lines
    $command .= 'grep -v "^$"';
    $command .= '|';
    // adding lines combining all adjacent two words
    // (the END block emits the last word, which would otherwise be skipped)
    // N.B.: the single quotes inside the command are escaped
    $command .= 'awk \'(PREV!="") {printf "%s\n%s %s\n", PREV, PREV, $1} {PREV=$1} END {if (PREV!="") print PREV}\'';

    $command .= '|';
    // stripping out common stopwords (the actual command is something like this: grep -Ev '(\bis\b|\bsuch\b)')
    $command .= 'grep -Evi "(\b';
    $command .= implode('\b|\b', $stopwords);
    $command .= '\b)"';

    $command .= '|';
    // removing all the words shorter than $minWordLength characters
    // (skipped when $minWordLength is below 2, because {1,0} is not a valid interval)
    $limit = (int) $minWordLength - 1;
    if ($limit >= 1)
    {
        $command .= "grep -Ev '^[a-zA-Z]{1,$limit}$'";
        $command .= '|';
    }
    // counting the occurrences of each term, most frequent first
    $command .= 'sort | uniq -c | sort -nr';
    $command .= '|';
    // stripping out the numbers we use for sorting
    $command .= "sed -e 's/^[^0-9]*[0-9]* //g'";

    $command .= '|';
    $command .= ' head -n ' . (int) $numberOfTerms;

    $commandOutput = shell_exec($command);

    // shell_exec() returns null when the command fails or prints nothing
    $commandOutputLines = explode("\n", (string) $commandOutput);

    // sanitising the return
    $ret = array();
    foreach ($commandOutputLines as $commandOutputLine)
    {
        $commandOutputLine = trim($commandOutputLine);
        if ( strlen($commandOutputLine) > 0 )
        {
            $ret[] = $commandOutputLine;
        }
    }

    return $ret;
}
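
To see what the generated pipeline does, here is a rough stand-alone shell sketch of the same steps. It is not the exact command the function builds: the helper name top_terms is made up, the stop word list is cut down to four words, tr replaces the sed step, and grep -w replaces the \b markers. It assumes a minimum length of at least 2.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PHP function): top_terms FILE MIN_LENGTH COUNT
top_terms() {
    file=$1
    min_length=$2   # assumed to be 2 or more
    count=$3

    # one word per line (tr -s squeezes runs of non-letters into a single newline)
    tr -cs 'a-zA-Z' '\n' < "$file" |
    # drop a possible leading empty line
    grep -v '^$' |
    # emit every word plus every pair of adjacent words
    awk '(PREV != "") {printf "%s\n%s %s\n", PREV, PREV, $1} {PREV = $1} END {if (PREV != "") print PREV}' |
    # drop every term that contains a stop word (tiny illustrative list)
    grep -Eviw 'the|is|a|of' |
    # drop single words shorter than min_length
    grep -Ev "^[a-zA-Z]{1,$((min_length - 1))}\$" |
    # count, sort by decreasing frequency, then strip the counts
    sort | uniq -c | sort -nr |
    sed -e 's/^ *[0-9]* //' |
    head -n "$count"
}

# Usage on a throw-away sample file
sample=$(mktemp)
printf '%s\n' 'The red apple is red. The red apple is sweet. A red apple of note.' > "$sample"
top_terms "$sample" 3 3
rm -f "$sample"
```

With this sample the first line printed is red, the most frequent term left after filtering. Note that any two-word term containing a stop word is dropped entirely, which is also how the PHP version behaves.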

Install KAlarm on Gnome

May 29th, 2010

You have to install kdepim:
_ yum install kdepim (on RedHat and CentOS)
_ apt-get install kdepim (on Debian/Ubuntu)

After that you can use KAlarm under Gnome (it is actually a KDE program).

How to Create a Screencast / Videocast / Video

December 20th, 2009

Use CamStudio this way:
_ set all the relevant settings
_ pause frequently, using the keyboard shortcuts
_ record different cuts and put them together with VirtualDub

How to do that with VirtualDub?
_ Open a file
_ Video > Direct Stream Copy
_ File > Append video segment
_ File > Save as AVI

How to Download Movies & Files on Linux

December 20th, 2009

_ Find the torrent with http://www.mininova.org/ or search on Google “Gladiator download torrent”
_ Use Transmission to download the torrent
_ Get subtitles from opensubtitles.org

How to Install Flash in Firefox on CentOS

December 19th, 2009

* Go to youtube.com and try to play a video. It should not play
* Instead of the video, there should be a link like ‘Install Flash’. Click it
* You should be redirected to the Adobe website, where you can download an RPM package
* After downloading it, install it with the rpm -i command (as root)
* Then, run:
sudo yum install flash-plugin

PHP - Nice Debug Trace

November 20th, 2009

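<?php
echo parse_backtrace(debug_backtrace());

function parse_backtrace($raw){

    $output = "";

    foreach ($raw as $entry) {
        // 'file', 'line' and 'args' are not always set (e.g. for internal calls)
        $file = isset($entry['file']) ? $entry['file'] : '[internal]';
        $line = isset($entry['line']) ? $entry['line'] : '?';
        $args = isset($entry['args']) ? $entry['args'] : array();

        $output .= "\nFile: " . $file . " (Line: " . $line . ")\n";
        $output .= "Function: " . $entry['function'] . "\n";
        $output .= "Args: " . implode(", ", array_map('parse_backtrace_arg', $args)) . "\n";
    }

    return $output;
}

// arrays and objects cannot be imploded directly, so only their type is printed
function parse_backtrace_arg($arg){
    return is_scalar($arg) ? var_export($arg, true) : gettype($arg);
}
?>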

CSS Sprite Online Generator

November 7th, 2009

http://spritegen.website-performance.org/

You can’t include animated gifs in the sprite (they would lose their animation).

You can’t use the sprite for list bullets: if the list item is quite long, the images below the bullet point icon in the sprite will show through.

Installing Memcached and PHP Memcache on CentOS and Ubuntu

November 4th, 2009

On CentOS:

Install Memcached
_ yum install memcached
_ /sbin/chkconfig memcached on
_ /etc/init.d/memcached start
_ vim /etc/sysconfig/memcached, and edit:
OPTIONS="-l 127.0.0.1"

Install PHP extension for Memcache:
_ yum install zlib-devel
This will prevent this error from occurring:
configure: error: memcache support requires ZLIB. Use --with-zlib-dir=<DIR> to specify prefix where ZLIB include and library are located
_ pear install pecl/memcache
You should have:
/usr/lib/php/modules/memcache.so
_ echo "extension=memcache.so" > /etc/php.d/memcache.ini
_ /etc/init.d/httpd restart

On Ubuntu:

_ apt-get install memcached
_ apt-get install php5-memcache
_ restart Apache
_ start memcached
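
To check that the daemon actually answers after the steps above, one option is to send it the stats command and read one value back. This is only a sketch: it assumes nc (netcat) is installed and that memcached listens on 127.0.0.1 on its default port 11211, and stat_value is a made-up helper name.

```shell
# Hypothetical helper: reads the reply to memcached's "stats" command on stdin
# and prints the value of the statistic named in $1.
stat_value() {
    # reply lines look like "STAT uptime 42" and end with \r\n
    tr -d '\r' | awk -v name="$1" '$1 == "STAT" && $2 == name {print $3}'
}

# Against a running server (left commented out, it needs memcached and nc):
# printf 'stats\r\nquit\r\n' | nc 127.0.0.1 11211 | stat_value uptime
```

If the commented command prints a number of seconds, memcached is up and reachable; no output suggests it is not running or is bound to another address.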

Minify CSS and Javascript Files With YUI Compressor From Linux Command Line

November 2nd, 2009

You can download YUICompressor from here:

http://yuilibrary.com/downloads/#yuicompressor

Extract the package

You need Java to execute it (on Ubuntu):

sudo apt-get install sun-java6-jre

To use it, you can execute this on the command line:

cat jquery-*.min.js library.js common.js lists.js tasks.js contexts-mgmt.js keys-mgmt.js | java -jar /home/dan/Desktop/yuicompressor-2.4.2/build/yuicompressor-2.4.2.jar --type js -o all`date +%Y%m%d%H%M%S`.js
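
If you run this often, the command can be wrapped in a small script. The sketch below is one possible way to do it: YUI_JAR, bundle_name and minify_js are made-up names, and the jar path is simply the one used above.

```shell
# Hypothetical wrapper; override YUI_JAR if the jar lives elsewhere.
YUI_JAR=${YUI_JAR:-/home/dan/Desktop/yuicompressor-2.4.2/build/yuicompressor-2.4.2.jar}

# Builds a timestamped file name (24-hour clock, so names sort chronologically)
bundle_name() {
    echo "all$(date +%Y%m%d%H%M%S).js"
}

# Concatenates the given files, minifies them into one bundle and prints its name
minify_js() {
    out=$(bundle_name)
    cat "$@" | java -jar "$YUI_JAR" --type js -o "$out" && echo "$out"
}

# Example (needs Java and the jar):
# minify_js jquery-*.min.js library.js common.js
```

The files are passed in the order they should be concatenated, so libraries go first.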

Symfony - Creating a New Plugin

October 29th, 2009

You need this just once:

_ sudo apt-get install php-pear
_ sudo pear channel-discover plugins.symfony-project.com

For every package:
_ Create the file package.xml and put it in the root of the plugin
_ run: pear package

To update the repository, use this URL:

http://svn.symfony-project.com/plugins/plugin_name