Saturday, December 26, 2009

How to get longitude and latitude from Google Maps

So, I have a new obsession. I've been toying around with Android and creating GPS-aware applications. I don't own an Android phone yet; I'm having a hard time deciding between a Droid and an Eris. In the meantime, I've been hanging around the Android sites, learning as much as I can about writing Android apps. One of the areas I'm interested in is GPS-aware apps, which change their behavior based upon the current GPS position.

The Android 2.0 SDK comes with its own Android device emulator, which can simulate a device changing its position in the real world. One way to drive this behavior is to submit a longitude and latitude. But how do you get longitude and latitude coordinates if you don't own a GPS, or the location you want to simulate isn't accessible? Google Maps can be used to provide them.
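For reference, the emulator accepts coordinates through the `geo fix` command on its console, which expects longitude first. Below is a minimal sketch of driving that from Python; the host and port 5554 are assumptions (5554 is the usual default for the first emulator instance), and the send function is untested scaffolding:

```python
# Sketch: drive the Android emulator's "geo fix" console command from Python.
# Assumes the emulator console is listening on localhost:5554 (the default
# for the first emulator instance); adjust for your setup.
import socket

def geo_fix_command(longitude, latitude):
    """Build the console command. Note: the emulator takes longitude FIRST."""
    return "geo fix %s %s\n" % (longitude, latitude)

def send_geo_fix(longitude, latitude, host="localhost", port=5554):
    """Send the command to a running emulator console (illustrative sketch)."""
    with socket.create_connection((host, port), timeout=5) as s:
        s.sendall(geo_fix_command(longitude, latitude).encode("ascii"))
```

Calling `send_geo_fix(-122.4194, 37.7749)` would move a simulated device to San Francisco.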

First, go to Google Maps and enter the address of interest. The address you just entered will appear on the left-hand side as a link. When you move your mouse over the link, its URL will appear at the bottom of your browser. (In my case, I'm using Firefox; it's possible your browser may not do this.) In the URL that appears, look for the parameter that appears as follows:

sll=X,Y

X will be the latitude. Y will be the longitude.
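Pulling the pair out of a copied link is easy to script. A sketch, assuming the link carries an sll=LAT,LNG parameter; the example URL here is made up for illustration (real links carry many more parameters):

```python
# Sketch: extract the sll coordinate pair from a Google Maps link.
from urllib.parse import urlparse, parse_qs

def extract_sll(url):
    """Return (latitude, longitude) from a link's sll=LAT,LNG parameter."""
    query = parse_qs(urlparse(url).query)
    lat, lng = query["sll"][0].split(",")
    return float(lat), float(lng)

url = "http://maps.google.com/maps?q=some+address&sll=37.7749,-122.4194&z=15"
print(extract_sll(url))  # (37.7749, -122.4194)
```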

Tuesday, November 10, 2009

1-liner for creating symbolic links in /usr/local/bin

If you have more symbolic links to create than is practical by hand, this is a useful 1-liner. In this case, it was used to install the binaries from Java 1.6 into /usr/local/bin:

find /usr/local/jdk1.6.0_17/bin|grep -vE "bin$|ControlPanel$"|perl -ne 'chomp; system "ln -s $_ /usr/local/bin";'

This sort of 1-liner is useful whenever the number of symbolic links to create is more than a handful. It easily scales to hundreds and even thousands of symbolic links. With the grep tool, one can also easily filter out items by name. Very useful!
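The same idea as a Python sketch, for cases where a one-liner gets unwieldy. The source and target paths mirror the one-liner above and are only examples; adjust for your system:

```python
# Sketch: symlink every binary from a source directory into a target
# directory, skipping unwanted names (like the one-liner's ControlPanel).
import os

def link_binaries(src_dir, dst_dir, skip=("ControlPanel",)):
    """Create dst_dir/<name> -> src_dir/<name> symlinks, skipping names in skip."""
    created = []
    for name in sorted(os.listdir(src_dir)):
        if name in skip:
            continue
        target = os.path.join(src_dir, name)
        link = os.path.join(dst_dir, name)
        if not os.path.lexists(link):   # don't clobber existing links
            os.symlink(target, link)
            created.append(link)
    return created
```

Usage would be `link_binaries("/usr/local/jdk1.6.0_17/bin", "/usr/local/bin")`.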

Wednesday, October 14, 2009

New Machine Learning API's to Explore

Today on reddit, someone asked about freely available machine learning API's.

Before the list gets buried, I'm duplicating the contents of that thread here for future exploration:

Weka - Java based ML API
http://www.cs.waikato.ac.nz/ml/weka/

Toolkit for Advanced Discriminative Modeling
http://tadm.sf.net/

Mallet - Java based ML API
http://mallet.cs.umass.edu/

WekaUT - An extension of Weka that adds clustering
http://www.cs.utexas.edu/users/ml/risc/code/

LibSVM - SVM API
http://www.csie.ntu.edu.tw/~cjlin/libsvm/

SVMlight - A C API for svm
http://svmlight.joachims.org/

C++ API for Neural Networks
http://github.com/bayerj/arac

Torch5 - A Matlab-like ML environment
http://torch5.sourceforge.net/

R - Open source statistical package that can be used for ML
http://cran.r-project.org/web/views/MachineLearning.html

pyML - Python API for ML
http://pyml.sourceforge.net/

Rapidminer - Open Source Data Mining Tool
http://rapid-i.com/wiki/index.php?title=Main_Page

Orange - An Open Source Data Mining Tool (Python and GUI based)
http://www.ailab.si/orange/

Glue - Open Source API for reinforcement learning (Can be used with multiple languages simultaneously)
http://glue.rl-community.org/wiki/Main_Page

Vowpal Wabbit - Learning API from Yahoo Research
http://hunch.net/~vw/

Tuesday, October 13, 2009

FC9 64-bit and VMware Workstation 6.5.3 issue with VMware Tools

I was working with a VMware appliance, which was configured to use the guest OS Fedora Core 9 64-bit.

I was using this guest OS with the latest version of VMware Workstation, v6.5.3. I downloaded v6.5.3 today and upgraded the VMware Tools of the FC9 64-bit guest.

It installed, but with a problem: after installing the VMware Tools that came with VMware Workstation 6.5.3, yum and the 'software updater' were unable to upgrade, remove, or download RPM's.

Saturday, October 10, 2009

Unable to find a C or C++ NLG open source tool this week

So, I've been exploring the area of NLG, natural language generation. My personal goal was to develop an application that would read a corpus and respond with either a summary of the corpus or a response to the categories found. In either case, I wanted the summary or response to not just be a template where the nouns/verbs/adjectives/predicates were merely filled in. That's no better than using grep.

As of this week, I can only find API's written in Java, Python, Lisp, and Prolog. Many of the listed NLG API's or applications haven't been touched in years, or are no longer available. Much to my displeasure, there is nothing in C or C++. I want something that will run lean and mean and can scale to datasets over a terabyte in size.

Tuesday, October 6, 2009

Natural Language Generation

While going over some nbc's, I stumbled across the AI area of NLP. And while observing where the areas of nbc and NLP meet, I found a new obsession: NLG. NLG is an acronym for natural language generation. Natural language generation is text created by a computer program that appears human-like in readability.

I first heard about this topic in detail in my AI class in grad school at the University of San Francisco. My professor, Dr. Brooks, had mentioned that researchers had been trying for years to create programs that could generate narratives for computer games. I even recall seeing on some news aggregator that someone had won a writing contest with a story written by an NLG system.

At the time I was taking my AI course, I was working for a horrible boss, who made everyone he saw and interacted with on a daily basis write weekly reports. I remember wanting to write a Perl or Python script that would do this for me. I made some attempts, but it was hard to get any realistic variance. It was essentially an overglorified Mad Lib, where the program only filled in the blanks.

I was looking for something more natural and human like.

In NLG, one takes data and applies generation rules to produce text that feels as if a human wrote it. Surprisingly, NLG is a relatively new area of research. Perhaps the best introduction to this topic is on Wikipedia, which will lead you to the Bateman and Zock list of Natural Language Generators (http://www.fb10.uni-bremen.de/anglistik/langpro/NLG-table/NLG-table-root.htm).

At the moment, the state of the art appears to be based upon the Java and Lisp languages. Since I work in embedded systems, where speed and a small footprint are key, I'm interested in implementations that are in C and can scale. I've noticed that most of the NLP and NLG systems I found do not have a database backend. This surprises me, since a database would allow for scaling and more consistent performance as the dataset grows.

I think I'll be experimenting with NLG to see if I can make a program that will generate an email that asks a user for info based upon an email inquiry.

Friday, September 25, 2009

Grammars

The purpose of this entry is to describe the 4 types of grammars that can be used to classify a language, and the means used to classify a language as one of the four types.

From a linguistics and NLP standpoint, languages can be classified by 4 possible grammar types. From the most expressive description to the least expressive description, a language can be described by a grammar known as type 0, type 1, type 2, or type 3. Each of the different grammars has a common name. A given grammar can sometimes describe other grammars: a type 0 grammar can describe type 1, type 2, and type 3 grammars; a type 1 grammar can describe type 2 and type 3; a type 2 grammar can describe a type 3 grammar; a type 3 grammar cannot describe any other type of grammar.

Each of the four types of grammars is composed of rules known as productions, which have the general form w1 -> w2. A production rule generates a sequence of terminal tokens and non-terminals; a non-terminal is a symbol, like the lhs symbol w1, that has production rules of its own.

A recursively enumerable grammar is also known as a type 0 grammar. A type 0 grammar has no restrictions on its production rules. A context-sensitive grammar is a type 1 grammar. A type 1 grammar is restricted to productions where the number of symbols on the rhs is equal to or greater than the number of symbols on the lhs. A context-free grammar is a type 2 grammar. A non-terminal in a type 2 grammar can always be replaced by its rhs; in comparison, a non-terminal in a type 1 grammar can only be replaced if there is a production that matches the surrounding symbols. A regular grammar is a type 3 grammar. A regular grammar is equivalent in power to a regular expression, which is what Perl, Python, and grep use when searching strings. A production of a regular grammar has a restricted form: the lhs is a non-terminal, and the rhs is a terminal, which is optionally followed by a non-terminal.

Grammars are also more formally known as phrase structure grammars.

G = phrase structure grammar as a set

G = (V,T,S,P)

V is a vocabulary, a set of terminal and non-terminal symbols
T is a subset of V; T is the set of terminal tokens
S is a start symbol/token that is a member of V
P is a set of production rules
N is V - T, the set of non-terminal symbols/tokens

Types of grammars and restrictions on their productions, w1 -> w2
0 no restrictions
1 length(w1) <= length(w2), or w2 = lambda
2 w1 = A, where A is non-terminal symbol
3 w1 = A and w2 = aB or w2 = a, where A and B are elements of N and a is an element of T; or S -> lambda
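These restrictions are mechanical enough to check in code. Below is a sketch, under a convention that is my assumption for illustration only: single uppercase letters are non-terminals, lowercase letters are terminals, and the empty string stands for lambda.

```python
# Sketch: classify a single production w1 -> w2 by the most restrictive
# grammar type it satisfies. Convention (an assumption for this sketch):
# uppercase letters are non-terminals, lowercase letters are terminals,
# "" stands for lambda (the empty string).

def is_nonterminal(s):
    return len(s) == 1 and s.isupper()

def production_type(w1, w2):
    """Return 3, 2, 1, or 0: the most restrictive type w1 -> w2 fits."""
    if is_nonterminal(w1):
        # Type 3: rhs is a terminal, optionally followed by a non-terminal,
        # or the special case S -> lambda.
        if (len(w2) == 1 and w2.islower()) or \
           (len(w2) == 2 and w2[0].islower() and is_nonterminal(w2[1])) or \
           (w1 == "S" and w2 == ""):
            return 3
        # Type 2: lhs is a single non-terminal, rhs unrestricted.
        return 2
    # Type 1: rhs at least as long as lhs.
    if len(w1) <= len(w2):
        return 1
    # Type 0: no restrictions at all.
    return 0
```

For example, `production_type("S", "aB")` classifies a regular production, while `production_type("aBc", "a")` falls all the way through to type 0 because the rhs is shorter than the lhs.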

Sunday, September 20, 2009

Tracing the boot up sequence of TAEB part 2

This entry goes into detail on the top of the main loop of the taeb script, lines 81-98.

taeb, lines 81-98, form the top level of the taeb main loop. The main loop is composed of 2 main parts. The first is the actual operational statement. The latter part, which is actually multiple statements, is only executed if taeb has been told, via the command-line option --loop, to re-execute itself after it has finished playing a nethack session.

Let's review these 2 portions, starting with the latter, since it is only executed when specified via the command line.

taeb, line 88: only reinitialize/restart taeb if the local variable $loop is not equal to 0. This happens when --loop is specified on the command line.

taeb, line 90: reset all taeb variables and state
taeb, lines 92-96: sleep for 5 seconds before starting a new nethack game

The eval statement, which is the first statement encountered at the start of this loop, does all the work.

taeb, line 83: sets an INT handler that prints a message to the screen and prevents starting a new game by setting $loop to 0.

taeb, line 84: execute a taeb session
taeb, line 85: reports the results of a taeb session

taeb, line 84, is TAEB->play; the method play is sent to the TAEB object, which is defined in TAEB.pm.

TAEB.pm, lines 742-749: each iteration of this loop is a step. At the end of each step, the results of the step are stored. The last step of the taeb session prints its results to the screen.

Inside each step, the following methods are called in order:


redraw
display_topline
human_input
full_input
handle_XXXX called

redraw is invoked from TAEB::Display::Curses; it is used to repaint the entire nethack screen at the start of a step. display_topline is also invoked from TAEB::Display::Curses; it displays any messages received from the last step in the current step. human_input is defined inside TAEB.pm; it is used to get keyboard input if the ai allows human control and a key is pressed. Under normal operational circumstances, human_input will not capture anything.

full_input is a wrapper method that is used to capture nethack screen data and load it into the taeb database. At the start of full_input's operation, the screen scraper is reset to take new info and the publisher is turned off; the publisher is resumed after scraping and processing of the scrape are finished. With the publisher off, the next instruction performed is process_input, which reads any input from a previous action or user and sends it to the vt, the virtual terminal. Afterward, the screen is scraped and loaded into the taeb database.

Next, the dungeon and senses variables are updated. After these items are updated, the publisher is re-enabled. This completes the phase of a step where the percepts are captured.
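The per-step flow can be sketched as a pipeline. This is a Python paraphrase of the control flow, not TAEB's actual Perl code; the method names mirror the ones described above, and the trace list is my addition so the ordering is visible:

```python
# Python paraphrase (not TAEB's actual code) of one step of the main loop.
class StepSketch:
    def __init__(self):
        self.trace = []  # record of what ran, in order

    def redraw(self):
        self.trace.append("redraw")            # repaint the nethack screen

    def display_topline(self):
        self.trace.append("display_topline")   # show last step's messages

    def human_input(self):
        self.trace.append("human_input")       # usually captures nothing

    def full_input(self):
        # Scrape the screen with the publisher paused, then update state.
        self.trace.append("publisher_off")
        self.trace.append("process_input")     # feed pending output to the vt
        self.trace.append("scrape_screen")     # load screen data into the db
        self.trace.append("update_dungeon_and_senses")
        self.trace.append("publisher_on")

    def handle_action(self):
        self.trace.append("handle_action")     # the handle_XXXX phase

    def step(self):
        for phase in (self.redraw, self.display_topline,
                      self.human_input, self.full_input, self.handle_action):
            phase()
```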

Saturday, September 19, 2009

Tracing the boot up sequence of TAEB part 1

The purpose of this entry is to describe in plain talk how Taeb starts up and executes its 'main' loop. Understanding the 'main' loop of the taeb framework enables the creation of Taeb agents.

taeb, line 1: enables the text file to be interpreted as a Perl script
taeb, line 2: no questionable Perl constructs allowed
taeb, line 3: adds the literal directory 'lib' to @INC
taeb, line 4: use the Perl library Getopt::Long to process command-line options
taeb, lines 7-20: definition of the print_usage subroutine, which shows the command-line options that can be used to configure Taeb operation

taeb, line 22: local variable used to control whether taeb will just stop when it receives an interrupt, or reset and begin a new execution. See the while-loop at line 81.

taeb, line 23: local list that stores command-line specified options for taeb. This local list is then transferred to Taeb's configuration. See lines 24, 28, 43.

taeb, line 24: local hash that is initialized to store hard references to the variable $loop and the list @config_overrides.

taeb, lines 25-39: code used to alter the operation of Taeb via command-line options, and to display the options that can be specified to Taeb.

taeb, line 41: specify that TAEB.pm must be used and found.
taeb, line 43: change Taeb configuration from the specified command-line options
taeb, lines 45-47: add to Taeb's configuration that no ai should be used, if the command-line option is specified.

taeb, lines 49-77: handlers assigned to the TSTP, CONT, TERM, USR1, and USR2 signals.

taeb, line 79: flush after every write
taeb, lines 81-98: Top of the main loop of Taeb.

Saturday, September 12, 2009

Demo AI with TAEB and try_explore part 1

TAEB ships with an AI known as Demo. One can use it as a reference for creating your own AI with TAEB. At its most simplistic level, the creation of a TAEB AI involves the following steps:


1. Create a Perl module that inherits from TAEB::AI
2. Create a method called next_action that returns an object of type TAEB::Action

I will be describing how next_action in Demo moves around the Nethack dungeon via the try_explore method. In TAEB's Demo AI, one of the actions it can execute is defined by the try_explore method. From the current position in the Nethack dungeon, TAEB will use try_explore to find the next best tile to reach among its 8 nearest neighbors.

try_explore itself is merely a wrapper. The work is done by first_match, which takes as an argument a type for the destination tile; in this case, the destination tile type is 'unexplored'. Tile types are defined in lib/TAEB/Util.pm, lines 73-91. first_match is part of the TAEB::World::Path package. The purpose of first_match is to identify the type of data structure that will be used to search for the next tile position; typically, for the purposes of try_explore, this will be undef. After determining the type of data structure for searching, first_match calls _dijkstra, an implementation of Dijkstra's path-finding algorithm.
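The search described above amounts to finding the nearest tile satisfying a predicate by shortest-path cost. Here is a generic sketch of that idea; this is not TAEB's implementation, and the grid representation is my assumption for illustration:

```python
# Sketch: Dijkstra search over a grid for the nearest tile matching a
# predicate, in the spirit of first_match. Not TAEB's implementation.
import heapq

def nearest_match(grid, start, is_match):
    """grid: dict (x, y) -> cost of stepping onto that tile (absent = solid).
    Returns (cost, tile) for the cheapest tile where is_match(tile) is true,
    or None if no matching tile is reachable."""
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        d, (x, y) = heapq.heappop(heap)
        if d > dist.get((x, y), float("inf")):
            continue  # stale heap entry
        if is_match((x, y)):
            return d, (x, y)
        # 8-connected neighborhood, matching nethack's diagonal movement.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                if dx == dy == 0:
                    continue
                nxt = (x + dx, y + dy)
                step = grid.get(nxt)
                if step is None:
                    continue
                if d + step < dist.get(nxt, float("inf")):
                    dist[nxt] = d + step
                    heapq.heappush(heap, (d + step, nxt))
    return None
```

On a uniform-cost grid this behaves like a breadth-first flood outward from the agent, which is exactly what "find the nearest unexplored tile" needs.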

Saturday, September 5, 2009

Alternate means of running a Taeb AI agent

The easiest way to use an AI agent other than Demo in Taeb is to install the new agent into your Taeb installation. If you do this, then you only need a config.yml in your .taeb directory that specifies the new agent.

But if, for some reason, you don't want the agent's source code installed into your main installation, there is another way. As before, you will still need a config.yml in your .taeb directory. If you have followed the software conventions for Taeb agents, cd into your agent's lib directory (the lib directory contains only the directory 'TAEB'). From inside lib, which is considered the 'top' of your agent's code, run 'taeb'. This will pick up the agent defined under that directory as the agent to run.

One can get the same result by setting PERL5LIB to the full path of your agent's 'lib' directory. This method has the advantage that only a config.yml is needed; there is no need to run taeb from the agent's specific lib directory.

Friday, September 4, 2009

TAEB keyboard commands

The keyboard commands that can be used to interact with a TAEB-based agent are defined in TAEB/lib/TAEB.pm

p - pause agent
d - change draw mode
i - show inventory
\cP - show old message
\cX - Senses
e - equipment menu
I - item spoiler data
M - monster spoiler data
\e - user input
r - refresh screen
\cr - refresh screen
q - save and exit
Q - controlled quit and exit

How to stop a TAEB AI properly

shift+Q

If you try some other key sequence, such as ctrl-c, taeb will just stop, but when you re-run taeb it will pick up where you left off. Most of the time I just want to start over.

Many key commands I think are located in lib/TAEB.pm in the TAEB source.

Structure of TAEB AI Behavioral Part 3

For the personality described in the Perl module Explorer.pm, each time the method next_action is called from the base class TAEB::AI::Behavioral::Personality, the member list is scanned for the behavior with the highest urgency.

The member list that is scanned is called prioritized_behaviors, an array of strings that are the human-readable names of the agent's possible behaviors. The elements of prioritized_behaviors (listed in scanning order, top to bottom) are: FixHunger, Heal, FixStatus, Defend, AttackSpell, BuffSelf, Kite, Melee, Projectiles, Vault, Shop, Carrion, GetItems, Equip, Identify, DipForExcalibur, Wish.

When next_action is called, it first performs any post-behaviors by sending the done method to complete any steps from the current behavior, if defined. The behavior Luckstone appears to be the only behavior with an implementation of the done method. After calling done, next_action scans the entire prioritized_behaviors array, looking for the behavior with the highest urgency. If multiple behaviors share the largest urgency, the first one found is considered the most urgent. As next_action goes through each behavior in the prioritized_behaviors list, it executes each behavior's prepare method. The prepare method of a TAEB::AI::Behavior instance performs setup in preparation for a behavior being chosen as the most urgent next behavior. For example, for the Explore.pm behavior, the prepare method performs a search of the nethack maze from the current position. When next_action completes its iteration over prioritized_behaviors, all behaviors have had their prepare methods executed, but only one behavior is the most urgent and will be converted into a nethack command, a TAEB::AI::Action subclass.
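The selection loop (prepare everything, take the first behavior with the highest urgency) can be sketched as follows. The behavior objects here are hypothetical stand-ins, not TAEB's classes; the urgency table is the string-to-number mapping Behavioral uses:

```python
# Sketch of the selection loop: call prepare on every behavior, then pick
# the FIRST behavior with the highest urgency. The behavior objects are
# hypothetical stand-ins for illustration, not TAEB's classes.
URGENCY = {"critical": 50, "important": 40, "normal": 30,
           "unimportant": 20, "fallback": 10, "none": 0}

def next_behavior(prioritized_behaviors):
    best, best_score = None, -1
    for behavior in prioritized_behaviors:
        behavior.prepare()                      # every behavior gets prepared
        score = URGENCY.get(behavior.urgency, 0)
        if score > best_score:                  # strict >: on a tie, the
            best, best_score = behavior, score  # earlier behavior wins
    return best
```

The strict comparison is what makes list order the tie-breaker: a later behavior with equal urgency never displaces an earlier one.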

Thursday, September 3, 2009

Structure of the Taeb AI Behavioral Part 2

The Behavioral AI has 4 possible personalities, which are defined by the Perl modules Explorer, ScoreWhore, Bathophobe, and Descender. Explorer.pm is the base personality; ScoreWhore, Bathophobe, and Descender are all based upon Explorer.pm.

When Explorer.pm is instantiated by the Taeb framework, a hash, %behaviors, is created. This hash uses the string names of actions as keys; these keys are mapped to instances of TAEB::AI::Behavioral::Behavior::<actionStringName>. Explorer.pm also contains an array of strings called prioritized_behaviors.

Each time next_action is called, the work is done in Behavioral.pm. The first task accomplished is that the criticalness of each action, which we will now call a behavior, is calculated. The criticalness of each behavior is a property called urgency, which is either a string or a numeric value. The string values of urgency are critical, important, normal, unimportant, fallback, and none. The corresponding numerical values are 50, 40, 30, 20, 10, and 0.

The method next_action calculates the urgency for each of the behaviors in prioritized_behaviors. The urgency is calculated by the find_urgency method. For each behavior, find_urgency examines the current state of the agent with respect to the environment and sets the urgency member value. As the urgencies are calculated, next_action is always looking for the behavior with the largest urgency. If multiple behaviors have the same urgency value, the behavior that appears first in prioritized_behaviors is the one selected for generating a nethack action. Sometimes no urgency is calculated; in the case of Explore.pm, the prepare method merely sets up the Explore behavior each time Explore is examined during next_action.

Wednesday, September 2, 2009

Structure of the Taeb AI Behavioral Part 1

This is the start of a series of entries on the Taeb AI agent known as Behavioral.

Taeb is an AI agent framework written in Perl 5. Within the Taeb framework, the Behavioral agent is found within the lib/TAEB/AI module.

Overall, Behavioral is composed of 5 modules:

lib/TAEB/AI
lib/TAEB/AI/Behavioral
lib/TAEB/AI/Behavioral/Behavior
lib/TAEB/AI/Behavioral/Meta
lib/TAEB/AI/Behavioral/Personality

lib/TAEB/AI contains the Behavioral module directory and the Perl module Behavioral.pm. Behavioral.pm causes a user to use a specific subclass of the Behavioral class.

The Behavioral module contains the Behavior, Meta, and Personality modules. The Behavior module contains the various actions an agent can perform. Meta defines the urgency type, which is used to choose a particular action. Personality contains the various types of Behavioral agents. In addition, the Perl modules Behavior.pm, Personality.pm, and ThreatEvaluation.pm live here. Every agent can accomplish some sort of action. Behavior.pm is a base class that defines the common attributes of an action. Personality.pm is the base class of the different types of Behavioral agents. ThreatEvaluation.pm is a data object used to determine the relationship between resource consumption and attacking a monster.

In the Personality module, there are 4 different types of Behavioral agents: Bathophobe, Descender, Explorer, and ScoreWhore. In the config.yml file, one specifies one of these Behavioral agent types.

Sunday, August 30, 2009

Configuring TAEB at launch to use a different AI(and other settings)

I've been playing around with Taeb and Nethack for a few days now. I'm really glad I ignored this game during college. I have a few friends who didn't and didn't finish.

I gave myself a crash course review of Perl. I mainly did this so I could understand the code for Demo.pm, which is the reference agent. I wanted to try to change agents, but I mistakenly thought this could be done with taeb's --config option.

In order to change the AI of Taeb, one must place a yml file into $HOME/.taeb. The yml file must be renamed to config.yml

Saturday, August 29, 2009

Setup of TAEB on openSUSE 11.1

TAEB stands for Tactical Amulet Extraction Bot. It is a framework, written in Perl, that enables one to make a bot that can play nethack. I'm interested in TAEB because I want to try to reduce its runtime footprint and experiment with bots that use search-based AI and rules-based AI. In the recent 2009 Mario AI contest, preliminary results showed that A* search implementations were beating the rules-based implementations. I found this surprising because I thought rules-based AI would be more flexible and faster.

I intend to do these same types of experiments with TAEB.

The first step was to get nethack installed. nethack is available at www.nethack.org, but OpenSUSE 11.1's current repositories also had it, so I just installed it with the OpenSUSE software installer.

I downloaded the latest TAEB using git: git clone git://github.com/sartak/TAEB.git

I discovered a problem when I attempted to install TAEB using sudo: sudo perl Makefile.PL
For some reason, the Perl function can_run couldn't find my installation of nethack. This was odd. It was necessary to create a symbolic link from /usr/bin/nethack to /usr/games/nethack.

After making the symbolic link to the nethack executable, there were a number of warnings about additional Perl modules not found on the system:




I went back to eliminate each of these warnings. Unless specified otherwise, I used the cpan executable to do so.

Note: I found that YAML::Syck had to be installed even though Makefile.PL did not detect or list that it was needed; when I ran taeb the first time, it complained that it couldn't find the file YAML/Syck.pm.

After YAML::Syck was installed, I rebuilt and reinstalled. Seems to be working!

Wednesday, August 26, 2009

Naive Bayes Classifiers in SpamAssassin

Spam classifiers like SpamAssassin are broadly used to split email into ham and spam. But how well would SpamAssassin's nbc perform if there were more than 2 categories? I have an idea for applying the nbc of a spam filter to sorting emails into more than 2 categories.

As a start of this investigation, I've decided to begin with some OSS Naive Bayes Classifier based spam filters. I'm starting with SpamAssassin. For the purposes of this experiment, I will be using SpamAssassin as a command-line tool.

spamassassin is a Perl front-end that one uses to classify an email, which is in a text file. One email per file.

sa-learn is a tool in the SpamAssassin suite that trains the nbc.
sa-learn --ham /path/to/directory/containing/ham loads the nbc with ham.
sa-learn --spam /path/to/directory/containing/spam loads the nbc with spam.

I've only acquired a corpus of a few thousand ham and spam emails. For what I need, I would like a corpus of up to a million documents that could be split into about 9 categories. I'm looking for a large corpus.

I've also noted that with nbc's that process text, there appears to be no restriction on the email size. In comparison, with nbc's used on images, it is required that all images in the corpus be the same size. I wonder if this is really necessary; I will check that with the face recognizer work currently in progress in my OpenCV project.

Sunday, August 23, 2009

Naive Bayes for text classification

Another article on Naive Bayes classification, aka nbc? Why does the web need another article on this? Yes, there are many; just use your favorite browser, search on "naive bayes", and you'll get hundreds of url's. However, once you start digging into them, you will find 2 annoying trends: they just reprint the Naive Bayes definition, and they give no specifics on how that definition was used to classify an actual type of data.



Originally, I learned about Naive Bayes in my AI class, taught by Dr. Christopher Brooks at the University of San Francisco back in 2005. Since then, the computer that held my information died, and I've been unable to retrieve my nbc program. After a long time, I finally found good info on Naive Bayes and how to use it for classifying text. First, I'll give the basic definitions of Bayes and Naive Bayes classification, which I'm simply resummarizing from the textbook "Artificial Intelligence" by Russell and Norvig. Then I'll talk about specifically applying Naive Bayes to text classification; this information I found in slides for a course called Comp221, written by the course's TA, Zhang Kai.

Naive Bayes

Naive Bayes starts from the regular Bayes rule, which simplifies the calculation of a conditional probability. It simplifies the calculation further by assuming that the effects are independent given the class. Even though this may not actually be true, this assumption has been found to yield acceptable behavior.


P(Class|Effects) = alpha * P(Class) * P(Effect1|Class) * ... * P(Effectn|Class), where alpha is a normalizing constant


Supervised Naive Bayes for Text Classification

The definition of Naive Bayes is easy to understand, but it lacks the details one needs to build a real application. I will fill in those details here. (Thanks, Dr. Brooks and Zhang Kai!)

(1) Start with a corpus and calculate P(Ci)
A corpus is a collection of data that will be used to train the Naive Bayes classifier. It should contain a large number of items, at least on the order of 1000. The corpus should be split into different classes, where each class occurs in the proportion one thinks the actual documents occur in real life. Ci is a class in C. P(Ci) = nc / ni, where nc is the number of corpus documents that belong to Ci and ni is the total number of documents in the corpus.

(2) For each class, calculate P(word|class)
For each class, there will be a collection of words associated with that class. One must calculate the probability that a given word will occur in a particular class.

ni = total number of words in the documents of Ci
wi = a word associated with Ci
wij = number of times wi occurs across all Ci documents
P(wi|Ci) = wij / ni

For each class, if a word from the training vocabulary never occurs in the documents of Ci, its estimated conditional probability is 'zero'. For a conditional probability that is 'zero', assign it the value eta/ni, where eta is a small tunable constant.

For each class, choose the top word frequencies as the words used to classify a document. Ideally, each chosen word would occur in every Ci in C.

(3) After (1) and (2) have been performed, the nbc has been trained. Let d be a new, unclassified document.

Take document d and find all the words that occur in the training corpus.

For each Ci, calculate P(Ci|Effects): for each word wi in d, calculate P(wi|Ci) with respect to the document d.

The largest P(Ci|Effects) is the matched class Ci for document d.
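Steps (1) through (3) can be condensed into a small sketch. This is illustrative code, not SpamAssassin's implementation; it assumes documents are already tokenized into word lists, uses the eta/ni smoothing described above, and works in log space to avoid numeric underflow on long documents:

```python
# Minimal sketch of steps (1)-(3): a multinomial Naive Bayes text classifier
# with eta/ni smoothing. Illustrative only, not SpamAssassin's code.
import math
from collections import Counter

def train(corpus, eta=1.0):
    """corpus: dict class -> list of documents (each a list of words)."""
    total_docs = sum(len(docs) for docs in corpus.values())
    model = {}
    for c, docs in corpus.items():
        counts = Counter(w for doc in docs for w in doc)
        n_words = sum(counts.values())
        model[c] = {
            "prior": len(docs) / total_docs,   # P(Ci) = nc / ni
            "counts": counts,                  # wij per word
            "n_words": n_words,                # ni (words in Ci's documents)
            "zero": eta / max(n_words, 1),     # smoothed value for unseen words
        }
    return model

def classify(model, doc):
    """Return the class maximizing P(Ci) * prod P(wi|Ci), in log space."""
    def score(c):
        m = model[c]
        s = math.log(m["prior"])
        for w in doc:
            p = m["counts"][w] / m["n_words"] if m["counts"][w] else m["zero"]
            s += math.log(p)
        return s
    return max(model, key=score)
```

Training on a couple of hand-made "spam" and "ham" word lists and classifying a fresh document exercises the whole (1)-(3) pipeline in a few lines.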

Training OpenCV for facial detection

I've been working on creating a training database of faces for OpenCV. I've acquired over 8 GB of faces, but things are going slowly. I've been using GIMP to view the faces and determine the ROI, but it's a slow process. Also, GIMP doesn't let me automatically specify an ROI: I can get the top-left corner of an ROI, but I have to manually post-process to get the width and height. My current method is slow, tedious, and error prone.

I'm considering making my own tool that can output an input database in a format that OpenCV can use. I'm going to try to do this in GIMP, but if I can't, then I'll do something with PIL or HighGUI in OpenCV.

Sunday, August 16, 2009

Generating random numbers

Handy if you need to generate random numbers:

od -An -N4 -l /dev/random
od -An -N4 -l /dev/urandom
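The same thing from Python, for scripts that want an integer rather than od's text output; os.urandom draws from the same kernel entropy source as /dev/urandom:

```python
# Sketch: the Python equivalent of reading 4 random bytes from /dev/urandom.
import os

def random_u32():
    """Return a random unsigned 32-bit integer from the kernel's CSPRNG."""
    return int.from_bytes(os.urandom(4), "little")

n = random_u32()
```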

Converting the Yale database to a format that OpenCV 1.1.1 can read

The Yale face database, http://cvc.yale.edu/projects/yalefaces/yalefaces.html, is a collection of 165 faces. This is a useful source of data for training a face recognizer. I know that OpenCV already comes with trained classifiers, but where is the fun in that? By training my own face recognizer myself, I'll find out about some missing steps.

For example, the Yale Face Database files are all GIF's, and OpenCV 1.1.1 does not read GIF's. They must be converted to some format that OpenCV can read. I chose to use PNG. But what is the easy way to convert these gif's to png's?

I used the Python Image Library and some 1-liner bash scripting magic to convert all gif's to png's.

First, I renamed all the database files to have the suffix ".gif"; the files in the Yale database don't have a suffix. I ran tests, and perhaps this was not necessary, but I didn't want to run into a problem with one of the files. This is the conversion I ran in the directory that contains all the Yale database files:

for i in `ls | xargs`; do pilconvert.py $i $i.png; done

pilconvert.py is a script that comes with the Python Image Library.

Now, if only the task of creating the rectangle for the ROI, region of interest, was just as easy.

Saturday, August 15, 2009

OpenCV and Yale database

I used the ch2_ex2_1.cpp example to attempt to view a face from the Yale database.
The Yale database images are GIF's. OpenCV doesn't read GIF's. Time for some Python and ImageMagick to convert these into PNG or JPG.

My Future OpenCV Projects

OpenCV Build Machine for openSUSE 11.1

OpenCV Build Machine for Ubuntu 8.04

Implement supervised and unsupervised recognizers using OpenCV and Yale face databases

Use machine learning API in OpenCV to create a spam filter

USB 3.0 highlights

I was reviewing the USB 3.0 protocol today. I can hardly wait to try out some USB 3.0 based products because of these features:

full-duplex
5.0 Gbps max throughput

I'm most excited about the full-duplex feature.

Why is my USB 2.x product so slow

USB stands for Universal Serial Bus. Conceived by the USB-IF consortium, it was intended as a replacement for the venerable RS-232 port and designed to be faster, but sometimes this does not happen in practice.


Even with USB 2.x, which has a reported speed of 480 Mb/s, implementers are often surprised that their actual data throughput is slower than RS-232 or even the parallel port. How could this happen?


The reason this can happen is poor utilization of the USB transfer protocol. There are two main data transfer types in USB, isochronous and bulk. The isochronous transfer can achieve the 480 Mbps rate, but this comes at a cost. The isochronous transfer's primary emphasis is speed. To achieve this data rate, the isochronous protocol performs no retries: there is no guarantee that data arrives without errors, or that it arrives at all.


USB bulk transfers are the exact opposite. It is important that data arrive intact and in the order sent. To guarantee this, bulk transfers add error detection and retry overhead, which reduces throughput. Let's run the numbers:


Theoretical maximum throughput of USB 2.x for bulk transfers:

480 Mb/s = 480 megabits / second = 503,316,480 bits/second (using 2^20 bits per megabit)



Time in USB 2.x

1 frame = 1 ms

1 uframe = 125 us

8000 uframes / 1 second



Max data payload size of a bulk USB 2.x transfer = 512 bytes



Max bytes per microframe (Table 5-10, USB 2.0 spec) = 6656 (13 transactions x 512 bytes)

6656 bytes / uframe * 8000 uframes / s = 53,248,000 bytes / s = 406.2 Mb/s



406.2 Mb/s is fast but this can only be approached if the USB 2.x bulk transfer is used efficiently.
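The arithmetic above is easy to check in a few lines of Python. Note that these figures divide by 2^20 bits per megabit, matching the numbers quoted in this post:

```python
MAX_BYTES_PER_UFRAME = 6656   # 13 bulk transactions x 512 bytes (Table 5-10, USB 2.0 spec)
UFRAMES_PER_SECOND = 8000     # 1 microframe = 125 us

bytes_per_second = MAX_BYTES_PER_UFRAME * UFRAMES_PER_SECOND
bits_per_second = bytes_per_second * 8
megabits_per_second = bits_per_second / 2**20   # 2**20 bits per megabit, as above

print(bytes_per_second)        # 53248000
print(megabits_per_second)     # 406.25, quoted as 406.2 above
```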


Tips on reaching the max limit of the USB 2.x bulk transfer protocol

  1. Minimize switching between reading and writing. If possible, let data flow in only one direction. USB 2.x is a half-duplex protocol, so turning the bus around costs time.
  2. As much as possible, fill the entire transfer to its maximum data payload size of 512 bytes.
  3. Directly connect the USB 2.x peripheral to the host PC's hub.

Thursday, August 13, 2009

Midnight Engineer's resume

Education:
BSEE SDSU '91
BSCS NDNU '05, 4.0 GPA

USF for 1 year in the MSCS program



Wind River 2007 to present
Sr. engineer specializing in using the Wind River compiler (Diab) with VxWorks, and build expert for VxWorks and Wind River Linux. Recognized expert in using VMware for recreating customer design environments. Creator of library of VMware machines for engineering, sales, and support use.


Nuvation 2005 to 2007
Firmware engineer who specialized in optimizing bulk transfer USB-based products for speed. Sustaining engineering for medical product for surgery. Integrated driver into Mac OS X of a cardbus memory card. USB architect for power control module. Introduced VMware as means of organizing and managing software tools for projects. Recognized expert in USB 1.x and 2.x protocols for hardware and software.

Internet Archive 2005
Conducted searches for archived websites. Data-mined terabyte-sized binary and text data using one-liner Perl and bash shell commands.


Xilinx 1994 to 2005


5 years, Support Engineer specializing in using Verilog and VHDL for FPGA designs, and JTAG for configuration.


5 years, Sr. Systems Design Engineer created PROM,CPLD, and FPGA programming algorithms. Created hardware configuration tools that used USB bulk transfers to configure PROM's, FPGA's, and CPLD's. MFC GUI programmer.

Languages
C, C++, Java, Python, bash shell, Perl, VHDL, Verilog, ANTLR, MFC

Tools
Eclipse, gdb, gcc, CATC USB protocol analyzer, oscilloscope, logic analyzer, Workbench, emacs

Interests
ANTLR, statistics-based algorithms for AI, programming languages, Linux, OpenCV

Monday, August 10, 2009

Installing OpenCV 1.1.1 on OpenSUSE 11.1

Special thanks to Damien Stewart who documented this procedure first.



I recently discovered the open source computer vision API called OpenCV. Originally created by Intel, it is now an open source project. It allows you to create computer vision applications that do things like facial recognition and tracking of moving objects. Its most famous deployment was in "Stanley," the robot that competed in the DARPA challenge to navigate a road without human guidance.



The procedure for installing OpenCV on a VMware Linux appliance was well documented by Damien Stewart. His procedure used Ubuntu 8.04 as a basis. This is a reimagining of that procedure using OpenSUSE 11.1.


(1) Install OpenSUSE 11.1 in a VMware appliance created by VMware Workstation 6.5.



(2) After installing OpenSUSE 11.1, there is no need to install the VMware Tools; OpenSUSE 11.1 installs them for you.



(3) You will need to enable networking and turn off DHCP6 for your network interfaces. You may need to reboot. Confirm that networking is working by opening Firefox and trying to surf to a URL.

Add the Packman and Videolan repositories. Run the "Install Software" application. Insert the OpenSUSE 11.1 DVD into the cdrom drive. In the "Install Software" application:

Packages Listing->Repositories

Edit

Add

Select the "Community Repositories" radio button

Next

In the list that appears, select the Packman and Videolan repositories.



(4) Install the packages: checkinstall, yasm, libfaac-devel, libfaad-devel, libmp3lame-devel, libtheora-devel, libxvidcore-devel, portaudio-devel, twolame, libtwolame-devel, libpng3, libjpeg-devel, libtiff-devel, libjasper-devel



If you get an error message that libtheora-devel cannot be installed, choose to install the suggested alternative.

(5) svn checkout svn://svn.ffmpeg.org/ffmpeg/trunk ffmpeg
cd ffmpeg
./configure --enable-gpl --enable-postproc --enable-pthreads \
--enable-libfaac --enable-libfaad --enable-libmp3lame \
--enable-libtheora --enable-libx264 --enable-libxvid \
--enable-shared --enable-nonfree


make

sudo make install



(6) svn checkout http://svn.berlios.de/svnroot/repos/guvcview/trunk/ guvcview
cd guvcview
make
sudo make install
guvcview

Your USB camera should be working with the guvcview tool.


(7) Next, download the examples from the OpenCV book from O'Reilly: http://examples.oreilly.com/9780596516130/

Unzip these examples. You will use this to test the build and install of OpenCV.


(8) Download the latest OpenCV from the trunk. As of this writing 8/11/09, the version of OpenCV is v1.1.1.
svn co https://opencvlibrary.svn.sourceforge.net/svnroot/opencvlibrary/trunk/opencv

cd opencv
mkdir release
cd release
cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local -D BUILD_EXAMPLES=ON ../
make
cd bin

(9) ./cxtest
Only 4 tests will fail.

(10) ./cxcoretest
All tests should pass.
cd ..

(11) sudo make install

(12) In .bashrc add: export LD_LIBRARY_PATH=/usr/local/lib:${LD_LIBRARY_PATH}

(13) cd release/bin

(14) ./lkdemo
This is the sample tracking application. You should see it work with your webcam.


(15) Unzip the OpenCV examples in a directory of your choice.

(16) g++ -o test ch2_ex2_1.cpp `pkg-config opencv --cflags --libs`
./test stuff.jpg
You should see a picture.

(17) g++ -o test ch2_ex2_2.cpp `pkg-config opencv --cflags --libs`
./test test.avi
./test tree.avi
Both avi movies should play

And that's it! Have fun! If you have feedback on this procedure, feel free to contact me.