uni_db.awk: gawk script to convert a Unihan.txt "tagged" database file to a relational SQLite 3 database named unihan.sl3.

Prerequisites: the GNU gawk package and the SQLite 3 package should be installed on the system. Also, the latest version of the Unihan.txt database from Unicode.org should be downloaded as a zip file, unihan.zip.

If there are any problems with paths, create a directory called, say, "unihan", and copy the sqlite3.exe and gawk.exe files into that directory. Unzip the unihan.zip file so that Unihan.txt is also in that directory. Now you can run the command line:

    gawk -f uni_db.awk Unihan.txt

Depending on the speed of your system, the process can take anywhere from about a minute to much longer. Note that a good amount of disk space is required for the Unihan.txt file, temporary files, and the unihan.sl3 relational database. To be on the safe side, I would recommend 150 megabytes of free disk space.

To test unihan.sl3, enter the command line:

    sqlite3 unihan.sl3

and the prompt will change to 'sqlite> '. Enter the following query at the prompt:

    sqlite> select * from kFenn limit 10;

and you should see the first ten records from the kFenn table.

For a list of the options available, enter:

    sqlite> .help

These options could be entered in a command file, mathews.sql:

    .mode csv
    .output mathews.csv
    select kBase.char, kMatthews.v from kBase, kMatthews where kBase.k = kMatthews.k;
    .output stdout

Note the semicolon that ends the select statement; without it, the sqlite3 shell treats the following .output line as a continuation of the SQL. This file could be executed by entering:

    sqlite> .read mathews.sql

You should then have a file, mathews.csv, which can be loaded directly into a variety of spreadsheet programs.

Of course, this only scratches the surface. By joining the tables in different ways, it is possible to greatly refine your result set. I would recommend the W3schools.com SQL tutorial as a good starting point, and keep the SQLite SQL documentation bookmarked in your browser.

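As a rough illustration of refining a result set with joins, here is a minimal Python sketch using the standard sqlite3 module. It assumes the layout suggested by the query above, namely a kBase table with columns k and char, and one table per tag with columns k and v; the actual schema produced by uni_db.awk may differ. The rows are placeholders in an in-memory database, not real Unihan values, and the function names are made up for this example.

```python
import sqlite3

def build_toy_db():
    """Create an in-memory database with the ASSUMED unihan.sl3 layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE kBase (k TEXT PRIMARY KEY, char TEXT);
        CREATE TABLE kMatthews (k TEXT, v TEXT);
        CREATE TABLE kFenn (k TEXT, v TEXT);
    """)
    # Placeholder rows only; the v values are not real Unihan data.
    conn.executemany("INSERT INTO kBase VALUES (?, ?)",
                     [("U+4E00", "一"), ("U+4E01", "丁"), ("U+4E03", "七")])
    conn.executemany("INSERT INTO kMatthews VALUES (?, ?)",
                     [("U+4E00", "m-placeholder-1"), ("U+4E01", "m-placeholder-2")])
    conn.executemany("INSERT INTO kFenn VALUES (?, ?)",
                     [("U+4E00", "f-placeholder-1"), ("U+4E03", "f-placeholder-3")])
    conn.commit()
    return conn

def chars_in_both(conn):
    """Return characters that have BOTH a kMatthews and a kFenn entry."""
    query = """
        SELECT kBase.char, kMatthews.v, kFenn.v
        FROM kBase
        JOIN kMatthews ON kMatthews.k = kBase.k
        JOIN kFenn ON kFenn.k = kBase.k
        ORDER BY kBase.k
    """
    return conn.execute(query).fetchall()

if __name__ == "__main__":
    for row in chars_in_both(build_toy_db()):
        print(row)
```

To try the same query against the real file, replace the in-memory connection with sqlite3.connect("unihan.sl3"), after checking the table and column names with the .schema command at the sqlite> prompt.
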
Also, if you have access to a more sophisticated DBMS, you might consider altering the uni_db.awk script to output the appropriate commands for creating a database on that DBMS, or exporting the unihan SQLite database as a SQL dump and then modifying the dump through a filter like gawk.
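
The dump-and-filter idea might look something like the sketch below. It is written in Python rather than gawk so that it is self-contained, and it uses the standard sqlite3 module's iterdump() to produce the dump. The rewrite rule is only an assumption for illustration: it supposes a hypothetical target DBMS that wants VARCHAR(255) where SQLite has TEXT. A real migration would need rules specific to the target system. The function name dump_for_other_dbms is invented for this example.

```python
import re
import sqlite3

def dump_for_other_dbms(conn, text_type="VARCHAR(255)"):
    """Yield the SQL dump of conn, rewriting TEXT column types.

    Only CREATE TABLE statements are touched, so the word TEXT
    inside inserted data is left alone.
    """
    for statement in conn.iterdump():
        if statement.upper().startswith("CREATE TABLE"):
            statement = re.sub(r"\bTEXT\b", text_type, statement)
        yield statement

if __name__ == "__main__":
    # Toy table with a placeholder row, not real Unihan data.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE kFenn (k TEXT, v TEXT)")
    conn.execute("INSERT INTO kFenn VALUES ('U+4E00', 'placeholder TEXT value')")
    conn.commit()
    for line in dump_for_other_dbms(conn):
        print(line)
```

Connecting to unihan.sl3 instead of the in-memory database and writing the yielded statements to a file would give a script to feed to the other DBMS. The .dump command at the sqlite> prompt produces a comparable dump that could be piped through gawk instead.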