We are building a system that infers PADS descriptions of ad hoc data formats and automatically generates tools for manipulating such data. The system currently generates (1) a tool that converts ad hoc data into a canonical XML form with a corresponding XML Schema, (2) a tool that converts ad hoc data into a more regular form suitable for loading into a relational database or a spreadsheet program such as Excel, and (3) a statistical analysis tool we call an accumulator.
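
As a rough illustration of the kind of summary a statistical analysis tool of this sort might produce, the Python sketch below tallies well-formed and malformed values for one numeric field of pipe-delimited records, along with the minimum, maximum, and mean of the well-formed values. It is not the generated PADS tool: the record layout, the `summarize` function, and the sample data are all hypothetical and chosen only for illustration.

```python
def summarize(lines, index, sep="|"):
    """Summarize one integer field across delimited records.

    Hypothetical helper for illustration only; it is not part of PADS.
    """
    good = 0          # values that parsed as integers
    bad = 0           # missing or malformed values
    total = 0
    minimum = None
    maximum = None

    for line in lines:
        fields = line.rstrip("\n").split(sep)
        try:
            value = int(fields[index])
        except (IndexError, ValueError):
            # Record is too short or the field is not an integer.
            bad += 1
            continue
        good += 1
        total += value
        minimum = value if minimum is None else min(minimum, value)
        maximum = value if maximum is None else max(maximum, value)

    return {
        "good": good,
        "bad": bad,
        "min": minimum,
        "max": maximum,
        "mean": total / good if good else None,
    }


# Made-up sample records: name | status | size
sample = [
    "a.html|200|512",
    "b.html|404|-",
    "c.html|200|2048",
]
summary = summarize(sample, 2)
```

Here `summary` reports two well-formed values and one malformed value for the third field. Counting malformed values separately, rather than stopping at the first one, matters for ad hoc data, where some records often deviate from the expected layout.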

To try the demo, select one of the ad hoc formats below. When you press submit, the learning software processes the selected format and displays the example data and the inferred description on the resulting page. From there, you can run any of the generated tools. The 'Roll your own' selection lets you enter your own data.

Computing the description may take a minute or so, depending on the speed of the machine hosting the demo and the complexity of the data.

Data sources

This work is the product of a collaboration between AT&T, Princeton University, and Galois.

It was partially supported by DARPA and the NSF.