| The Free Advertising Forum http://freeadvertising.forez.ws/ |
| Text Analysis Tool http://freeadvertising.forez.ws/viewtopic.php/229..http-freeadvertising-forez-ws-viewtopic-php-f-10andt-229 |
| Author: | Forez [ Sun Apr 19, 2009 11:25 pm ] |
| Post subject: | Text Analysis Tool |
Hi there buddies,

I wrote a simple tool that might help you with your advertising and publishing efforts. It runs a word and phrase analysis on any given text. Currently the tool can only read text from local files, but future versions should be able to fetch text directly from web pages.

The tool finds every repeating word and every repeating phrase in the text. It can search for repeating phrases up to one hundred words long. It sorts the repeating words in descending order of how often they occur and calculates each word's density. It has native UTF-8 support, so it should be able to analyze text in most languages (I haven't done much testing on that last point).

Typical usage:
> If you run pay-per-click campaigns you probably know that the AdWords robots estimate how relevant your ad is to the page you promote, basically by looking for words and phrases that repeat across your PPC ad and the promoted web page. So this tool can give you a clue about which words and phrases are repeated most often in the page's text, and which of them you should use to optimize your ads and lower your advertising costs.
> If you publish articles or do search-engine optimization (for example), you probably want to know how many words your article or web page contains and which words and phrases are repeated most.
> The tool can also be used for a less typical purpose: program code analysis and optimization. For example, if you want to make your code smaller, the tool can tell you which pieces of code are repeated most, so you can shorten variable names or substitute repeating chunks of code with something shorter.
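To make the counting described above a bit more concrete, here is a minimal sketch in Python (the tool itself is written in PHP, and this is not its source code). The function names `words_of`, `word_report` and `repeated_phrases` are made up for illustration, the tokenizer is deliberately naive, and "density" is assumed to mean a word's count divided by the total number of words, as a percentage; the post does not give the exact formula.

```python
import re
from collections import Counter


def words_of(text):
    """Lower-case the text and split it into words (a deliberately naive tokenizer)."""
    return re.findall(r"\w+", text.lower())


def word_report(words):
    """Return (word, count, density %) for every repeating word, most frequent first.

    Assumption: density = count / total number of words * 100.
    """
    total = len(words)
    counts = Counter(words)
    return [
        (word, n, round(100.0 * n / total, 2))
        for word, n in counts.most_common()
        if n > 1
    ]


def repeated_phrases(words, max_len=100):
    """Return {phrase: count} for every run of 2..max_len words seen more than once."""
    found = {}
    for size in range(2, min(max_len, len(words)) + 1):
        counts = Counter(
            " ".join(words[i:i + size]) for i in range(len(words) - size + 1)
        )
        repeats = {phrase: n for phrase, n in counts.items() if n > 1}
        if not repeats:
            # If no phrase of this length repeats, no longer phrase can repeat either.
            break
        found.update(repeats)
    return found
```

For the sample text `"Free ads work. Free ads sell. Ads sell more"`, `word_report(words_of(...))` puts `ads` first with 3 occurrences out of 9 words (33.33%), and `repeated_phrases` reports `free ads` and `ads sell` twice each. Counting every phrase length separately is simple but slow on large texts; a real implementation would probably want something smarter.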
Analysis mechanism and to-dos:
> Currently the tool reads the entire text and strips all the punctuation, while trying to preserve some special types of words: URL addresses, e-mail addresses and numbers. Because the punctuation is gone, sentence boundaries are gone too. So keep in mind that if the end of one sentence followed by the beginning of the next appears more than once in a text, it may show up as a repeating phrase in the analysis result.
> The tool also converts all the text to lower case, again while trying to preserve the special types of words mentioned above. So the analyzer does not distinguish between upper and lower case (unless the words are "special"). Later versions of the program should have an option for case-sensitive analysis.
> As I mentioned above, the tool should eventually be able to fetch text directly from web pages, but for now this functionality is under development.

Installation and requirements:
> This text analysis tool is written in pure PHP. The program uses the GTK2 library for the graphical user interface and the multi-byte-string library for cross-language support. I bundled all the necessary libraries and executables with the program scripts, so all you have to do is unzip and run... I haven't tested this tool under Linux, but I believe it would work there as well as it does under Windows.
> The package doesn't touch the registry or any configuration files, so it will not conflict with other installations of PHP, for example, that you might have.

Final notes and terms of use:
> As I mentioned earlier, this text analysis tool is still under development. There are a lot of fixes and optimizations to be made. However, it already serves its general purpose pretty well at this stage of development. You can give it a try and tell me what you think by posting replies here...
> This tool is free for use and redistribution. You can do whatever you like with it and the source code. You can tune it the way it serves you best. Thanks for your interest.
> And one more thing: I shouldn't be held responsible for any harm that this text analysis tool may cause to you, to others, or to any property. Use this tool solely at your own risk (for example when analyzing too much text).

The source and the executables can be obtained from http://forez.ws/Downloads/

Thanks again for your interest and have a nice day... |
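Coming back to the "Analysis mechanism" notes above: the sketch below shows one way such a normalization step could look, in Python rather than the tool's PHP. It is only an illustration under assumptions. The post does not say how the tool recognizes URLs, e-mail addresses or numbers, so the patterns in `SPECIAL` and the function name `normalize` are hypothetical.

```python
import re

# Assumed patterns for the "special" words; illustrative only, not the tool's own rules.
SPECIAL = re.compile(
    r"""
    (?P<url>https?://[^\s<>]+)
  | (?P<email>[\w.+-]+@[\w-]+(?:\.[\w-]+)+)
  | (?P<number>\d+(?:[.,]\d+)*)
  | (?P<word>\w+)
    """,
    re.VERBOSE,
)


def normalize(text):
    """Drop punctuation and lower-case ordinary words; keep special words unchanged."""
    tokens = []
    for match in SPECIAL.finditer(text):
        token = match.group()
        kind = match.lastgroup  # name of the alternative that matched
        if kind == "word":
            token = token.lower()
        elif kind == "url":
            # A URL at the end of a sentence drags the closing punctuation along.
            token = token.rstrip(".,;:!?)")
        tokens.append(token)
    return tokens
```

With this sketch, `normalize("Visit http://forez.ws/Downloads/. Mail Me@Example.com about 3.14 or 2009, OK?")` keeps the URL, the (made-up) address and both numbers intact and lower-cases everything else. It also shows the sentence-boundary caveat: `normalize("It works. It works.")` returns `['it', 'works', 'it', 'works']`, so `works it` looks like an ordinary two-word phrase.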
| All times are UTC |