Uses and Intentions

[draft]

 

Note:

The DDL project homepage is http://ddl.sscli.net .

You can mail the authors at spark@sscli.net and dolly@sscli.net

 

File Format Support

The main intention in creating the DDL is that it be used as a rapid application development (RAD) tool for file handling applications. On its own, the DDL engine is not a useful program; it is useful as part of a developer's toolkit. The engine is provided as a run-time library or a static library that is linked into the host program written by the developer, and it then does the file handling for that host program. The engine loads a specified 'DDL script' file and a specified data file and maps the script onto the data. The host program can then query it for values from the data file.

 

For this purpose the developer of the application must:

  1. Obtain a format specification of the data file that the application is intended to support.
  2. Formalize the specification in the syntax of the DDL so that it can be communicated to the DDL interpreter/engine.
  3. Obtain a copy of a data source file.
  4. Test the resulting script with any of the DDL engine development tools provided, doing sample runs on the data file to ensure that values are read as expected.
  5. Once the script has been tested, integrate the DDL engine into the application. The engine is available as a compile-time library module or a run-time dynamic link library.
  6. Once the integration is complete, load the script file and data file into the engine from the application's source code and query it programmatically for values.

 

 

 

Enabling file format support through the DDL goes only as far as retrieving values from the data source. Each value is accessed through a unique name. The DDL attaches no special meaning to any of these values, nor does it dictate what you do with them.
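
The idea of retrieving values by unique name can be illustrated with a minimal Python sketch. It does not use the DDL engine or real DDL script syntax, neither of which is shown in this section; the DESCRIPTION table, the map_description function and the field names are all hypothetical stand-ins, and Python's standard struct module plays the role of the engine.

```python
import struct

# Hypothetical stand-in for a DDL script: each entry gives a unique name,
# a byte offset, and a struct format code. This is NOT real DDL syntax.
DESCRIPTION = [
    ("magic", 0, "<2s"),   # two raw bytes
    ("width", 2, "<H"),    # unsigned 16-bit integer, little-endian
    ("height", 4, "<H"),   # unsigned 16-bit integer, little-endian
]


def map_description(description, data):
    """Map every described field onto the data and return {name: value}."""
    values = {}
    for name, offset, fmt in description:
        if name in values:
            raise ValueError(f"duplicate field name: {name}")
        (values[name],) = struct.unpack_from(fmt, data, offset)
    return values


# A synthetic six-byte "data file", built in memory for the example.
data = b"XY" + struct.pack("<HH", 640, 480)
values = map_description(DESCRIPTION, data)
print(values["width"], values["height"])
```

The host program only ever asks for a name; what "width" means, and what is done with it, stays entirely with the host program.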

 

The DDL can be considered an additional API layer above the basic file handling exposed by the operating system. While it does nothing that is not possible with the underlying API, it makes the job substantially simpler and less error-prone.

 

Prototyping Tool

Any tool that allows easy development and quick verification of results is by nature a good prototyping tool. The DDL can be a good prototyping tool in cases

·         where there are ambiguities about the exact file format, or about the exact values of the variables in it, that need to be resolved before the final code is written

·         where a new file format is being developed

·         where a project has a short development schedule and little time for extensive debugging

·         where the project needs to be up and working before each section is implemented to its final specification.

 

While the DDL is reasonably efficient in its file handling and internal algorithms, it may not be as fast as tight, efficiently written C code for one specific format. Where extreme efficiency is needed, custom code will generally be faster. In such cases the DDL can still serve as a good test bench for developing the format-specific C code.

 

Extensions and Applicability

Address Specification extension

This section discusses near-future developments for the DDL. The current version handles every address of a variable within a data block as a bit offset, as previously discussed.

 

This is a simple kind of offset addressing: every bit in the stream is numbered, and a single number identifies a location. It is not without its quirks, but it is the only addressing scheme a single bit stream allows, and it applies to any bit stream.
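
As an illustration of bit-offset addressing, here is a minimal Python sketch that reads a field from a byte string given only a bit offset and a bit count. The function name read_bits and the bit-numbering convention (bit 0 is the most significant bit of the first byte) are assumptions made for the example; the DDL's own convention may differ.

```python
def read_bits(data: bytes, bit_offset: int, bit_count: int) -> int:
    """Return bit_count bits starting at bit_offset as an unsigned integer.

    Assumed convention: bits are numbered from 0, starting with the most
    significant bit of the first byte.
    """
    if bit_offset < 0 or bit_count < 0:
        raise ValueError("offset and count must be non-negative")
    total_bits = len(data) * 8
    if bit_offset + bit_count > total_bits:
        raise ValueError("read past the end of the stream")
    whole = int.from_bytes(data, "big")            # the stream as one big number
    shift = total_bits - bit_offset - bit_count    # bits to the right of the field
    return (whole >> shift) & ((1 << bit_count) - 1)


stream = bytes([0b10110000, 0b00001111])
print(read_bits(stream, 0, 4))    # first four bits of the stream
print(read_bits(stream, 12, 4))   # last four bits of the stream
```

Because the address is a plain bit number, fields that are not byte-aligned, or that straddle byte boundaries, need no special treatment.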

 

Extensions to the language are required to support data formats made up of multiple bit streams, such as block devices as opposed to character or bit stream devices. When multiple bit streams are supported, an address will combine a specifier for a particular block or stream with an address within that block or stream.

 

Digression on Map-ability

The language specification of the DDL simply defines data structures that exist within the data source. The language per se imposes no restriction on where the data comes from, so a data structure may be mapped onto data from any source. This is an important concept that must be understood. In all the previous discussions the data source was treated as a file, but it need not be one.

 

Thus if the DDL were extended to handle any kind of data source, its range of applications would expand considerably. Most data source specifications are ways of saying which bit stream to use. The present intention is to specify a data stream through combinations of numbers and strings, which covers a broad category of data sources.

Data Source and Stream Name conventions

A Single File.
This is the common usage of the DDL where a location is just the offset of a bit position. This is true of any data that is within a single file. The exact bit offset will identify the data unambiguously.
In-File Address = Number

Memory.
Addresses within PC memory are likewise either plain linear addresses or combined addresses made up of a segment and an offset.
Memory Address = Number

Hard Disks and Diskettes.
Addressing on hard disk and floppy disk drives usually takes one of two forms: CHS or LBA. In CHS mode the cylinder (a number), head (a number) and sector (a number) identify a particular data block (a sector), and a further offset address identifies individual bits within the block.
CHS Disk Addressing = Number, Number, Number + Number

In LBA mode a block (sector) is identified by a single number, the logical block address, and an offset address again identifies bits within the block.
LBA Addressing = Number + Number

File Systems.
This applies when the data being described is scattered across several files in a file system. The address therefore consists of a path that identifies a file and an offset address within that file.
File System Addressing = String + Number

Networks.
In networking, a data source can be the actual data readable from a communications port or software port. This is operation below the protocol level. In such a case the specification would involve a hardware or software port ID and some identifier for the communication received through that channel. The communication may be organized as packets with offsets within packets, as bit streams, and so on.
IO Communication = Hardware ID + Number
Software Port = Number (Port ID) + Number (Packet ID) + Number (Offset)

Above the protocol level a data block may be accessed through a URI or a more complicated mechanism. Once the data block is identified, it can be treated like a file, and offset addressing should be possible within it.
Network Resource =  String (URI) + Number
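
The conventions above all reduce to "something that selects a stream, plus a number inside it". A minimal Python sketch of that idea follows; the class names, field names and the describe function are hypothetical and are not part of the DDL.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class FileAddress:       # In-File Address = Number
    bit_offset: int


@dataclass(frozen=True)
class ChsAddress:        # CHS Disk Addressing = Number, Number, Number + Number
    cylinder: int
    head: int
    sector: int
    bit_offset: int


@dataclass(frozen=True)
class LbaAddress:        # LBA Addressing = Number + Number
    block: int
    bit_offset: int


@dataclass(frozen=True)
class NamedAddress:      # String + Number: a file system path or a URI
    name: str
    bit_offset: int


Address = Union[FileAddress, ChsAddress, LbaAddress, NamedAddress]


def describe(addr: Address) -> str:
    """Render an address as '<stream selector> + <bit offset>'."""
    if isinstance(addr, FileAddress):
        return str(addr.bit_offset)
    if isinstance(addr, ChsAddress):
        return f"CHS({addr.cylinder}, {addr.head}, {addr.sector}) + {addr.bit_offset}"
    if isinstance(addr, LbaAddress):
        return f"LBA({addr.block}) + {addr.bit_offset}"
    if isinstance(addr, NamedAddress):
        return f"{addr.name} + {addr.bit_offset}"
    raise TypeError(f"unsupported address type: {type(addr).__name__}")


print(describe(ChsAddress(0, 1, 2, 8)))
print(describe(NamedAddress("/etc/hosts", 128)))
```

Keeping the stream selector and the offset as separate parts means the existing bit-offset logic can stay unchanged once a block or stream has been selected.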

Thus the class of addressing schemes is rather vast. It may not be feasible or advisable to implement all of them in the DDL. Some approaches that could be followed are:

·         Modify the DDL data reader section either at source level or as a dynamic module to enable reading an unsupported data source.

·         Build in read support for many of the expected kinds of data source, and move the address calculation mechanism into the script so that the user can specify how the blocks relate to each other. For example, the user would have to supply code that enables the DDL to understand that
CHS(x, y, 63) + 1 is CHS(x, y+1, 0)
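
The kind of address calculation a user might have to supply can be sketched in Python as follows. The function chs_add is hypothetical, and the sketch assumes zero-based sector numbers and a geometry of 64 sectors per track so that it reproduces the example above. Real disk geometry conventionally numbers sectors from 1, which would shift the sector arithmetic by one.

```python
def chs_add(chs, count, heads, sectors_per_track):
    """Advance a zero-based (cylinder, head, sector) address by count sectors.

    The address is flattened to a linear sector number, advanced, and then
    split back into its three parts, so carries propagate from sector to
    head to cylinder.
    """
    cylinder, head, sector = chs
    linear = (cylinder * heads + head) * sectors_per_track + sector + count
    if linear < 0:
        raise ValueError("address before the start of the disk")
    track, sector = divmod(linear, sectors_per_track)
    cylinder, head = divmod(track, heads)
    return (cylinder, head, sector)


# Assumed geometry for the example: 16 heads, 64 sectors per track.
print(chs_add((2, 3, 63), 1, heads=16, sectors_per_track=64))
print(chs_add((2, 15, 63), 1, heads=16, sectors_per_track=64))
```

The first call carries from the sector into the head, and the second carries from the head into the cylinder as well.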

 

Extended Uses

These are some of the applications expected for future versions of the DDL, once it supports more addressing schemes. Note that the features required for most of this functionality are already in place. These applications may not be of immediate commercial interest to a developer.

 

Scriptable File Systems

The DDL can be used as a scriptable file system interpreter. With the disk addressing mechanism in place, it would be possible to load a script file for a particular file system type, say ext2fs or NTFS, and read data out of files stored in that file system format.

This has considerable potential as an educational tool as well as a data recovery tool, and would ideally belong in a hacker's toolkit.


In-Memory Data Structure Debugger

The DDL can interpret data structures in any data source, even the computer memory. This facility could be used to debug the internal data structures used by an application during runtime.

Imagine that the application in question uses a binary tree. If one could locate the root address of the tree and script the structure of a tree node in the DDL, the DDL could then read the values in the program’s binary tree while the program is running.
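
The binary tree case can be sketched in Python against a synthetic memory image. Everything here is an assumption made for illustration: the node layout (a 32-bit value followed by two 32-bit child addresses, with address 0 meaning "no child"), the names NODE and read_tree, and the use of a byte buffer in place of a live process's memory.

```python
import struct

# Hypothetical node layout, standing in for a DDL script of the node:
# int32 value, uint32 left child address, uint32 right child address.
# An address of 0 means "no child".
NODE = struct.Struct("<iII")


def read_tree(memory, address):
    """Return the values of the tree rooted at address, in sorted (in-order) order."""
    if address == 0:
        return []
    value, left, right = NODE.unpack_from(memory, address)
    return read_tree(memory, left) + [value] + read_tree(memory, right)


# Synthetic memory image: 4 bytes of padding so that no node sits at
# address 0, followed by three 12-byte nodes at addresses 4, 16 and 28.
memory = bytearray(4 + 3 * NODE.size)
NODE.pack_into(memory, 4, 8, 16, 28)    # root: value 8, children at 16 and 28
NODE.pack_into(memory, 16, 3, 0, 0)     # left leaf: value 3
NODE.pack_into(memory, 28, 10, 0, 0)    # right leaf: value 10

print(read_tree(memory, 4))
```

Only the root address and the node layout are needed; the traversal itself follows the child addresses stored in the data.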

 

This kind of data structure debugging is a rather interesting concept, simply because most debuggers available today handle tasks like bounds checking, breakpoints and tracing, but few can effectively create a browsable display of a running application's data structures. The concept of scriptable debuggers is still developing. A comparable, and possibly superior, tool in this respect is ‘acid’, the programmable debugger of Plan 9, the operating system created at Bell Labs by the group behind Unix, among them Ken Thompson, Dennis Ritchie and Rob Pike.

 

Protocol Scriptable Server

Similar to file format support, another (probably more difficult) task is supporting binary protocol formats. These are usually as cryptic as any file format, and are usually harder to support because of their dynamic nature.

Binary protocol servers could be run through the DDL, with the DDL handling the task of wrapping and unwrapping data to and from the protocol stream.
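
Wrapping and unwrapping can be illustrated with a minimal Python sketch. The frame layout (a one-byte message type, a two-byte big-endian payload length, then the payload) and the names HEADER, wrap and unwrap are invented for the example and do not describe any real protocol or DDL feature.

```python
import struct

# Hypothetical frame layout: 1-byte message type, 2-byte big-endian
# payload length, then the payload itself.
HEADER = struct.Struct(">BH")


def wrap(msg_type: int, payload: bytes) -> bytes:
    """Build one frame from a message type and a payload."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload too long for a 16-bit length field")
    return HEADER.pack(msg_type, len(payload)) + payload


def unwrap(stream: bytes):
    """Split the first frame off the stream.

    Returns (msg_type, payload, remaining_bytes), or None if the stream
    does not yet hold a complete frame.
    """
    if len(stream) < HEADER.size:
        return None
    msg_type, length = HEADER.unpack_from(stream)
    end = HEADER.size + length
    if len(stream) < end:
        return None
    return msg_type, stream[HEADER.size:end], stream[end:]


stream = wrap(1, b"hi") + wrap(2, b"")
first = unwrap(stream)
print(first)
print(unwrap(first[2]))
```

Returning None for an incomplete frame reflects the dynamic nature mentioned above: unlike a file, a protocol stream may not have delivered all of its data when it is read.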