[draft]
Note:
The DDL project homepage is http://ddl.sscli.net .
You can mail the authors at spark@sscli.net and dolly@sscli.net.
The main intention in creating the DDL is that it should serve as a rapid application development (RAD) tool for file handling applications. On its own, the DDL engine is not a useful program; it is useful as part of a developer's toolkit. The engine is provided as a run-time library or a static library that is linked into the host program written by the developer, and the DDL then does the file handling for that host program. The engine loads a specified 'DDL script' file and a specified data file and maps the script onto the data. The host program can then query it for values from the data file.
For this purpose the developer of the application must

Enabling file format support through the DDL goes only as far as being able to retrieve values from the data source. Each value should be accessed through a unique name. The DDL does not attach any special meanings to any of these values nor does it say what you do with these values.
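As a rough illustration of "each value is accessed through a unique name", the sketch below maps names onto positions in a byte buffer. It is only an analogy: the `LAYOUT` table, the `query` function and the sample field names are hypothetical and are neither DDL script syntax nor the DDL engine's API.

```python
import struct

# Hypothetical layout table: unique name -> (byte offset, struct format).
# It stands in for a DDL script; it is NOT DDL syntax.
LAYOUT = {
    "magic":   (0, "<4s"),  # 4 raw bytes
    "version": (4, "<H"),   # unsigned 16-bit, little-endian
    "width":   (6, "<I"),   # unsigned 32-bit, little-endian
}

def query(data: bytes, name: str):
    """Return the raw value stored under a unique name, with no interpretation."""
    offset, fmt = LAYOUT[name]
    (value,) = struct.unpack_from(fmt, data, offset)
    return value

# A made-up 10-byte data source matching the layout above.
sample = b"DEMO" + struct.pack("<HI", 2, 640)
print(query(sample, "version"), query(sample, "width"))
```

What the host program does with the returned values is left entirely to the host program, which mirrors the division of labour described above.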
The DDL can be considered an additional API layer above the basic file handling exposed by the operating system. While it does nothing that is not possible with the underlying API, it makes the job substantially simpler and less error-prone.
Any tool that lends itself to easy development and quick verification of correctness is by nature a good prototyping tool. The DDL can be a good prototyping tool in cases:
· where there are ambiguities about the exact file format, or about the exact values of the variables in the file format, and these need to be verified before the final code is written
· where a new file format is being developed
· where a project has a short development time and extensive debugging time may not be available
· where the project needs to be up and working before each section is implemented to its final specification.
While the DDL is reasonably efficient in its file handling and internal algorithms, it may not be as fast as tight, efficiently written C code for one specific format. Where extreme efficiency is needed, custom code will generally be faster. In such cases the DDL can still serve as a good test bench for developing the format-specific C code.
This section discusses the near-future developments for the DDL. The current version of the DDL handles all the addresses of variables within a data block as bit offsets, as previously discussed.
This is a simple kind of offset addressing: every bit in the stream is numbered, and an address is just one such number. It is not without its quirks, but it is the natural addressing scheme for a single bit stream, and it applies to any bit stream.
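Single-number bit addressing can be sketched in a few lines. The `read_bits` helper below is hypothetical, and it assumes bits are numbered from the most significant bit of the first byte; the DDL's own bit-numbering convention may differ.

```python
def read_bits(data: bytes, bit_offset: int, width: int) -> int:
    """Read `width` bits starting at `bit_offset`, most significant bit first.

    Assumption: bit 0 is the top bit of the first byte.
    """
    total_bits = len(data) * 8
    if bit_offset < 0 or width < 0 or bit_offset + width > total_bits:
        raise ValueError("bit range lies outside the stream")
    whole = int.from_bytes(data, "big")          # the stream as one big integer
    shift = total_bits - (bit_offset + width)    # bits to the right of the field
    return (whole >> shift) & ((1 << width) - 1)

stream = bytes([0b10110010, 0b01000001])
print(read_bits(stream, 0, 3))   # first three bits
print(read_bits(stream, 4, 8))   # a field that straddles the byte boundary
```

Note that a field does not need to be byte-aligned: the second call reads eight bits that span two bytes.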
Extensions to the language are required to support data formats made up of multiple bit streams, such as block devices as opposed to char/bit stream devices. When multiple bit streams are supported, an address will combine a selector for a particular block or stream with an address inside that block or stream.
The language specification of the DDL simply defines data structures that exist within the data source. The language per se does not impose any qualification on the source of the data. Thus a data structure may be mapped onto data from any source. This is an important concept that must be understood. In all the previous discussions the data source was treated as a file, but the data source need not be a file.
Thus, if the DDL were extended to handle any kind of data source, the range of applications would expand considerably. Most data source specifications are ways of saying which bit stream to use. Our present intention is to use combinations of numbers and strings to specify a data stream. This covers a broad category of data sources.
A Single File.
This is the common usage of the DDL where a location is just the
offset of a bit position. This is true of any data that is within a single
file. The exact bit offset will identify the data unambiguously.
In-File Address = Number
Memory.
Addresses within PC memory are again either plain linear addresses or
combined addresses made up of a segment and an offset.
Memory Address = Number
Hard Disks and Diskettes.
Addressing on HDDs and FDDs is usually done in one of two ways: CHS or
LBA addressing. In CHS mode the Cylinder (number), Head (number) and
Sector (number) identify a particular data block (a sector), and a further
offset address identifies individual bits within the block.
CHS Disk Addressing = Number, Number, Number + Number
In LBA mode a block (sector) is selected by a single number, the logical
block address, and an offset address identifies bits within the block.
LBA Addressing = Number + Number
File Systems.
This is when the data you are trying to define is scattered across
various files in a file system. The address would therefore have a path
that identifies a file and the offset address within that file.
File System Addressing = String + Number
Networks.
In terms of networking, a data source can be the actual data that
is readable from a communications port or software port. This is
operation below the protocol level. In such a case the specification would
involve a hardware or software port id and some id of the communication
received through that channel. The communication may be categorized as packets
and offsets within packets, as bit streams, etc.
IO Communication = Hardware ID + Number
Software Port = Number (Port ID) + Number (Packet ID) + Number (Offset)
Above the protocol level a data block may be accessed through some URI or a more
complicated mechanism. Once the data block is identified, it can be treated like a
file and offset addressing should be possible in it.
Network Resource = String (URI) + Number
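The address forms listed above share one shape: a stream selector built from numbers and strings, followed by an offset inside that stream. A minimal sketch of that idea follows; the `Address` class, the constructor names and the 512-byte block size are illustrative assumptions, not part of the DDL.

```python
from dataclasses import dataclass
from typing import Tuple, Union

Component = Union[int, str]

@dataclass(frozen=True)
class Address:
    """A stream selector (numbers and/or strings) plus a bit offset inside it."""
    selector: Tuple[Component, ...]
    offset: int

def in_file(offset: int) -> Address:
    return Address((), offset)                        # Number

def chs(cylinder: int, head: int, sector: int, offset: int) -> Address:
    return Address((cylinder, head, sector), offset)  # Number, Number, Number + Number

def lba(block: int, offset: int) -> Address:
    return Address((block,), offset)                  # Number + Number

def file_system(path: str, offset: int) -> Address:
    return Address((path,), offset)                   # String + Number

def software_port(port_id: int, packet_id: int, offset: int) -> Address:
    return Address((port_id, packet_id), offset)      # Number + Number + Number

def lba_to_linear(addr: Address, bits_per_block: int = 512 * 8) -> int:
    """Flatten an LBA address to one bit offset (assumes 512-byte blocks)."""
    (block,) = addr.selector
    return block * bits_per_block + addr.offset

print(lba_to_linear(lba(2, 5)))
print(file_system("/etc/hosts", 16))
```

Keeping the selector as an opaque tuple means a reader module only has to understand the selectors for the data sources it actually supports.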
Thus the class of addressing schemes is rather vast. It may not be feasible or advisable to implement all of these in the DDL. Some of the approaches that could be followed are:
· Modify the DDL data reader section either at source level or as a dynamic module to enable reading an unsupported data source.
· Build in read support for many of the expected kinds of data source, and move
the address calculation mechanism to the script so that the user can specify
how the blocks relate to each other. For example, the user would have to give
code that would enable the DDL to understand that
CHS(x,y,63) + 1 is CHS(x,y+1,0)
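The kind of user-supplied address calculation meant here might look like the following sketch. The function name and the geometry defaults are assumptions chosen so that the example above holds: sectors are numbered from 0 with 64 per track. Real CHS geometries usually number sectors from 1, in which case the wrap-around value would change accordingly.

```python
def chs_next(cylinder: int, head: int, sector: int,
             heads_per_cylinder: int = 16, sectors_per_track: int = 64):
    """Return the CHS triple of the block that follows (cylinder, head, sector).

    Assumption: sectors run from 0 to sectors_per_track - 1, to match the
    example in the text. The geometry values are illustrative defaults.
    """
    sector += 1
    if sector == sectors_per_track:       # ran off the end of the track
        sector = 0
        head += 1
        if head == heads_per_cylinder:    # ran off the last head
            head = 0
            cylinder += 1
    return cylinder, head, sector

print(chs_next(3, 5, 63))    # last sector of a track carries into the head
print(chs_next(3, 15, 63))   # last head carries into the cylinder
```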
These are some of the expected applications of the DDL for future versions that support more addressing schemes. Note that the features required for most of this functionality are already in place. These applications may not be of immediate commercial interest to a developer.
The DDL can be used as a scriptable file system interpreter. That is, with the disk addressing mechanism in place, it would be possible to load a script file for a particular file system type, such as ext2fs or NTFS, and read data out of files stored in that file system format.
This has tremendous applications as an educational tool as well as a data recovery tool, and would ideally belong in a hacker's toolkit.
The DDL can interpret data structures in any data source, even the computer memory. This facility could be used to debug the internal data structures used by an application during runtime.
Imagine that the application in question uses a binary tree. If one could locate the root address of the binary tree and script the structure of a tree node in the DDL, the DDL could then read the values held in the program's binary tree while it is running.
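The binary tree idea can be tried out on a simulated memory image. In the sketch below a `bytes` object stands in for the application's memory, and the node layout (a 32-bit value followed by two 32-bit child addresses, with 0 meaning "no child") is an assumption made for the example, not a DDL definition.

```python
import struct

# Assumed node layout: value, left child address, right child address.
NODE = struct.Struct("<iII")   # 12 bytes per node; address 0 means "no child"

def build_image() -> bytes:
    """Build a fake memory image holding a three-node tree rooted at address 12."""
    image = bytearray(48)
    NODE.pack_into(image, 12, 50, 24, 36)  # root: value 50, children at 24 and 36
    NODE.pack_into(image, 24, 30, 0, 0)    # left leaf
    NODE.pack_into(image, 36, 70, 0, 0)    # right leaf
    return bytes(image)

def in_order(memory: bytes, address: int) -> list:
    """Walk the tree starting at `address` and return its values in order."""
    if address == 0:
        return []
    value, left, right = NODE.unpack_from(memory, address)
    return in_order(memory, left) + [value] + in_order(memory, right)

print(in_order(build_image(), 12))
```

Only the root address and the node layout are needed; everything else is discovered by following the addresses stored in the nodes.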
This kind of data structure browsing is a rather interesting concept, simply because most debuggers available today handle tasks like bounds checking, breakpoints and tracing, but few can effectively create a browsable display of a running application's data structures. The concept of scriptable debuggers is still under development. A comparable, and possibly superior, tool in this respect is Acid, the programmable debugger of Plan 9, the operating system created at Bell Labs by members of the legendary Unix team, including Ken Thompson, Dennis Ritchie and Rob Pike.
As with file formats, another (and probably more difficult) area to implement is support for binary protocol formats. These are usually as cryptic as any file format, and are usually harder to support because of their dynamic nature.
Binary protocol servers could be run through the DDL, where the DDL handles the task of data wrapping and unwrapping to and from the protocol stream.
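To make "wrapping and unwrapping" concrete, here is a minimal sketch for a made-up binary protocol. The message format (a 1-byte type and a 2-byte payload length in network byte order, then the payload) and the `wrap` and `unwrap` names are assumptions for illustration only; they do not describe any protocol the DDL supports.

```python
import struct

# Assumed header: message type (1 byte) + payload length (2 bytes), big-endian.
HEADER = struct.Struct("!BH")

def wrap(msg_type: int, payload: bytes) -> bytes:
    """Prefix a payload with its header so it can be sent on the stream."""
    return HEADER.pack(msg_type, len(payload)) + payload

def unwrap(stream: bytes):
    """Split a byte stream into (type, payload) messages.

    Returns the complete messages and any trailing bytes that do not yet
    form a whole message, which is the 'dynamic' part of protocol handling.
    """
    messages = []
    pos = 0
    while pos + HEADER.size <= len(stream):
        msg_type, length = HEADER.unpack_from(stream, pos)
        end = pos + HEADER.size + length
        if end > len(stream):          # payload not fully received yet
            break
        messages.append((msg_type, stream[pos + HEADER.size:end]))
        pos = end
    return messages, stream[pos:]

data = wrap(1, b"hi") + wrap(2, b"") + b"\x03\x00"   # last message is incomplete
print(unwrap(data))
```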