GDS II Data Format

The GDS format from Calma has a long history in EDA.  The GDS data format lives on today despite a far superior data format known as OASIS.  I once had a couple of Calma systems, and here’s the GDS II spec from those days.

Computer chip development teams have adjusted the GDS spec over the years from what was originally written.  Here is a list of features, issues, and ad hoc extensions of the GDS data format.  Not all of these are a good idea, so, like specs, read carefully.

Data Extends Beyond 32-Bits

The GDS spec is limited to a 32-bit signed range for specifying its coordinates.  However, some folks have figured out how to exceed that limitation.  One may do it by placing a large enough cell at the edge of the 32-bit range so that its data extend beyond the 32-bit range.  For example, placing a cell at coordinates x=2,000,000,000 with a width of 200,000,000 puts the right edge of that placed cell at 2,200,000,000, beyond the 32-bit limit of 2,147,483,647.  The problem is that nobody has defined what that means, so not all tools interpret it the same way.  Some tools may not notice at all that 32 bits have been exceeded, and so the portion that fell over the edge may wrap around to the other side like toroidal or Asteroids coordinates.  Other tools may decide that going beyond the 32-bit boundary means it really does go beyond instead of wrapping around.
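
To make the wraparound case concrete, here is a minimal Python sketch of the arithmetic using the numbers from the example above.  The helper name wrap_int32 is made up for illustration, and it models only the "careless tool" behavior, not what any particular tool does.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def wrap_int32(value: int) -> int:
    """Reduce an integer to signed 32 bits, as a tool that never checks for overflow might."""
    return (value - INT32_MIN) % 2**32 + INT32_MIN


placement_x = 2_000_000_000
cell_width = 200_000_000
right_edge = placement_x + cell_width   # 2,200,000,000

print(right_edge > INT32_MAX)           # True: the edge is past the 32-bit limit
print(wrap_int32(right_edge))           # -2094967296: it lands on the far negative side
```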

Conclusion:  If you exceed 32-bits with GDS then make sure you know what your downstream tools do with it.  And they are not going to do what you want.  This is best avoided entirely.  If you want >32b coordinates, move to OASIS.  Quickly.

User Units

Conclusion:  Ignore them.  Nothing to see here.

Spaces in Cell Names

The STRNAME is the structure name, aka cell name.  Some people actually use spaces in their cell names, but this should not be an extension.  This is forbidden in the spec.  And in the OASIS spec.

Conclusion:  Don’t do this and everyone can sleep peacefully.

Negative Absolute Width Paths

If the width of a path is negative, that means that its width is absolute.  In this parlance, absolute means that the width of the path does not change even if there are magnifications in the cell hierarchy that contains the path.
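
If you are stuck reading such a file, the rule can be sketched in a few lines of Python.  The function name and the accumulated_mag parameter (the product of all magnifications down the hierarchy) are my own inventions for illustration.

```python
def effective_path_width(stored_width: int, accumulated_mag: float) -> float:
    """Width of a path as drawn, given the WIDTH value stored in the file."""
    if stored_width < 0:
        # Negative means absolute: use the magnitude, ignore all magnification.
        return float(-stored_width)
    # Ordinary width: scales with the magnifications in the hierarchy.
    return stored_width * accumulated_mag


print(effective_path_width(100, 2.0))    # 200.0
print(effective_path_width(-100, 2.0))   # 100.0
```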

Conclusion:  This is insane.  Never do this.

Absolute Angles and Absolute Magnifications

If someone is using these, they may have a functioning Calma.  Otherwise, forget it.

Conclusion:  Avoid absolutely.

Drop The BOUNDARY’s Last Point

Polygons are described by BOUNDARY records.  But they require the first and last points in the XY buffer to be the same.  As you might gather, many programs simply drop the duplicated last point in the XY buffer for BOUNDARY records.
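
Supporting this on the reading side is cheap.  A minimal sketch in Python, assuming the XY buffer has already been decoded into a list of (x, y) tuples (the function name is hypothetical):

```python
from typing import List, Tuple

Point = Tuple[int, int]


def close_boundary(points: List[Point]) -> List[Point]:
    """Return the polygon with its first point repeated at the end, adding it if the writer dropped it."""
    if not points:
        return []
    if points[0] != points[-1]:
        return points + [points[0]]
    return list(points)


# A rectangle written with only four points instead of the five the spec asks for.
print(close_boundary([(0, 0), (10, 0), (10, 5), (0, 5)]))
```

A reader that normalizes every BOUNDARY this way can treat spec-conforming and point-dropping files identically from then on.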

Conclusion:  Support this.  Please.

Drop ENDSTR Records

Every cell (well, structure in GDS parlance) must end with an ENDSTR record.  But if the next record is a BGNSTR to start a new cell or the end of the file with an ENDLIB, then why bother with that cell’s ENDSTR record?  After all, that will save four bytes.  Yes, some folks do this, but it is unclear why given just how badly GDS squanders space otherwise.
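
One way to tolerate this in a reader is to put the implied ENDSTR back before the rest of the parser sees the stream.  The Python sketch below works on a sequence of record names rather than raw bytes, which is a simplification of mine, and the function name is hypothetical.

```python
from typing import Iterable, Iterator


def insert_missing_endstr(record_names: Iterable[str]) -> Iterator[str]:
    """Yield the record names, adding an ENDSTR wherever a structure was left open."""
    in_structure = False
    for name in record_names:
        if in_structure and name in ("BGNSTR", "ENDLIB"):
            yield "ENDSTR"          # the writer skipped it, so supply it
            in_structure = False
        if name == "BGNSTR":
            in_structure = True
        elif name == "ENDSTR":
            in_structure = False
        yield name


sloppy = ["BGNLIB", "BGNSTR", "STRNAME", "BGNSTR", "STRNAME", "ENDLIB"]
print(list(insert_missing_endstr(sloppy)))
```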

Conclusion:  Support this minor transgression of the spec to save a meaningless amount of space.

More Layers and Data Types Beyond 0 to 63

A data type in this parlance means a sub-layer of a layer.  You essentially think of a layer divided into sub-layers indexed by something called the “data type”.  The GDS spec requires both to be in the range of 0 to 63.  However, the space reserved is a two-byte integer.  That means we can extend the range of layers and data types to -32,768 to 32,767 or 0 to 65,535.  Given the two modes for the extension, it’s best to go for the unsigned integer and interpret the extra bits as 0 to 65,535.
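
The two readings differ only in how the same two bytes are unpacked.  A small Python sketch, assuming the usual big-endian byte order of GDS integers (the function name is made up):

```python
import struct


def read_layer(payload: bytes) -> int:
    """Decode a two-byte LAYER or DATATYPE payload as an unsigned big-endian integer."""
    (layer,) = struct.unpack(">H", payload)
    return layer


raw = b"\x80\x01"
print(struct.unpack(">h", raw)[0])   # -32767 when read as signed
print(read_layer(raw))               # 32769 when read as unsigned
```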

Conclusion:  Support 0 to 65,535 layers.  Do not support negative layers.

Odd-Width Paths

The GDS spec allows one to encode paths of odd-width.  At first glance, one might think that’s not so bad.  But everything on a chip must be written to a manufacturing grid.  A typical grid is 1 nanometer.  So if you have a path that goes from point (0,0) to point (100,0) of width 5, then it is a bit difficult to manufacture since the edges of the path are y=2.5 and y=-2.5 nanometers.  Very few tools get odd-width paths correct (some do), so they should be avoided at all costs.  OASIS forbids them by referring only to a half-width, which makes translation tricky.  Do you maintain the path construct or do you maintain the edges?
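
A checker for this is tiny.  The Python sketch below, with hypothetical function names, computes the edge offsets of a horizontal path exactly and flags widths that put edges off the grid.

```python
from fractions import Fraction
from typing import Tuple


def path_edge_offsets(width: int) -> Tuple[Fraction, Fraction]:
    """Offsets of the two edges from the center line, in grid units."""
    half = Fraction(width, 2)
    return -half, half


def edges_on_grid(width: int) -> bool:
    """True when both edges land on integer grid coordinates."""
    return width % 2 == 0


print(path_edge_offsets(5))   # (Fraction(-5, 2), Fraction(5, 2)), i.e. -2.5 and 2.5
print(edges_on_grid(5))       # False
print(edges_on_grid(6))       # True
```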

Conclusion:  Tell layout designers to stop making odd-width paths or you’ll force them to move to OASIS.

Rotate AREFs Properly

GDS AREF records have confused many tool developers, even venerable layout tools written in the 1980s.  Perhaps it was due to said tool writers having no Calmas to debug their programs.  Most of the confusion lies in the rotation of the array of cell references that the AREF is meant to represent.  The presentation of the GDS AREF record basically is:

<AREF>::=  AREF SNAME [<strans>] COLROW XY

The issue straddles the STRANS record, and the XY record.  The STRANS records an angle, and the XY record has three points interpreted as two vectors as represented in the diagram below.  These latter two vectors give the rotation of the array’s entries so that the rows and columns are perpendicular to those two vectors.

Some readers of the spec assume that since two encodings for angle have been given, then these two encodings must describe different things.  For instance, that could mean that the vectors from the XY record give you the orientation of the rows and columns as in the figure above, and the angle in the STRANS record independently and uniformly rotates the entries in the array.  But that’s not what it means.  The two terms are redundant, not independent.  This is simply due to the level of hardware at the time that had no floating point support.  Also recall rectangles requiring five points.  That’s due to the simple drawing program.  Drawing AREFs seems harder than drawing rectangles, and the redundant info is meant to help the drawing program.
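
Here is how I would sketch the placement of the array entries in Python.  It assumes the usual reading of the three XY points: the first is the array origin, the second is the origin displaced by the column count times the column pitch, and the third is the origin displaced by the row count times the row pitch.  The function name is hypothetical.  Note that the lattice comes entirely from the XY points; the STRANS angle is not applied to it a second time.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def aref_positions(cols: int, rows: int, xy: List[Point]) -> List[Point]:
    """Origins of every array entry, row by row, derived only from the three XY points."""
    (x0, y0), (x1, y1), (x2, y2) = xy
    positions = []
    for r in range(rows):
        for c in range(cols):
            # Step c column pitches along the first vector and r row pitches along the second.
            x = x0 + (c * (x1 - x0)) // cols + (r * (x2 - x0)) // rows
            y = y0 + (c * (y1 - y0)) // cols + (r * (y2 - y0)) // rows
            positions.append((x, y))
    return positions


# 3 columns by 2 rows, pitch 100, unrotated and then rotated by 90 degrees.
print(aref_positions(3, 2, [(0, 0), (300, 0), (0, 200)]))
print(aref_positions(3, 2, [(0, 0), (0, 300), (-200, 0)]))
```

In the rotated case the columns march up the y axis and the rows march toward negative x, which is what a 90 degree STRANS angle should agree with in a well-formed file.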

Conclusion:  Rotate your AREFs properly, or you’ll need magic bits to turn on/off that feature so downstream tools can understand your output.

STRNAMEs and SNAMEs Longer Than 32 Characters

The STRNAME is the cell, or structure, name when data for the cell begin in the file.  The SNAME is used in a reference to place that named cell at a location specified by the reference.  The GDS spec restricts cell names to a max 32 character length.  That’s a bit short these days, so read on about extending STRING records, which look structurally the same as these two.

STRINGs Longer Than 512 Characters

The GDS spec limits the STRING record to 512 characters.  The space reserved by a GDS record is specified by a two byte unsigned integer.  Since GDS records must be an even number of bytes, the max record size is 65,534 bytes.  Since four bytes form the header of a GDS record, the effective space left is 65,530 bytes.  This extension gives GDS STRING strings a max length of 65,530 characters.

Conclusion:  Support long strings, but do not bother to concatenate these records to make arbitrarily long strings.  Strings as long as 64K are not so common anyways, and if one wanted such one should move to OASIS anyways.

Polygons or Paths With More Than 200 Points

The GDS spec only allows for 200 points per polygon or path center line in the XY record.  That is a significant limit, so get rid of it and allow filling up the full 65,530 bytes with points instead of only 200 * xy = 200 * 2 * 4 = 1600 bytes, since each point is two four-byte integers.  Hence we can get 65,530/8 = 8191.25, or 8191 points on a polygon or path center line.
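
As a sanity check on the 8191 figure, here is a hedged Python sketch of a writer for a single XY record.  The record type 0x10 and data type 0x03 (four-byte signed integer) are the codes as I remember them from the spec, so verify them against your copy; the function name is my own.

```python
import struct
from typing import List, Tuple

MAX_RECORD_BYTES = 65_534      # largest even value of the two-byte length field
HEADER_BYTES = 4               # two bytes of length, one record type, one data type
BYTES_PER_POINT = 2 * 4        # x and y, four bytes each
MAX_POINTS = (MAX_RECORD_BYTES - HEADER_BYTES) // BYTES_PER_POINT

XY_RECORD_TYPE = 0x10          # assumed code for XY
INT32_DATA_TYPE = 0x03         # assumed code for four-byte signed integers


def pack_xy(points: List[Tuple[int, int]]) -> bytes:
    """Encode the points as one big-endian XY record, refusing anything that will not fit."""
    if len(points) > MAX_POINTS:
        raise ValueError(f"{len(points)} points exceed the single-record limit of {MAX_POINTS}")
    payload = b"".join(struct.pack(">ii", x, y) for x, y in points)
    header = struct.pack(">HBB", HEADER_BYTES + len(payload), XY_RECORD_TYPE, INT32_DATA_TYPE)
    return header + payload


print(MAX_POINTS)                          # 8191
print(len(pack_xy([(0, 0)] * MAX_POINTS))) # 65532
```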

What sort of polygon on a chip has more than 8K points?  Well, some things like ground networks, meshes like in a grill, or reverse tone of a contact layer (a single rectangle subtracting out millions of little rectangles inside it makes for one big polygon) and so on.

Some tools may extend this limit by allowing contiguous XY records to be concatenated together in the output stream.  Unfortunately, this has limited support in most tools, meaning it won’t work in most flows.  It is best to avoid GDS if you are going to have a lot of points on your polygons or paths.  Instead, use OASIS which has no internal limit on the point count.

Conclusion:  Support this, but only for a single XY record.  Do not support contiguous XY records, which have low downstream tool support in flows.  Instead, just move to OASIS to get past the 8191 point count limit anxiety.

Must Strings Have Nulls?

Many people use the C programming language, and many people think that strings must have null bytes (a zero byte) attached to the end of every string.  However, this is half true in GDS, and totally false in OASIS.  Strings stored in OASIS leave off the null byte entirely (which can be a pain in its own way).  In GDS, however, all records are an even byte length due to the 16 bit processor running the records.  The string “foo” encoded in a GDS STRING record encodes as “foo<null>” with a terminating null byte, but “food” does not as it fits comfortably in four bytes, that is, two 16-bit words.  Getting back to the C programmers, this detail does not matter to them, and therefore every string written through their private GDS program gets a null byte added to the end, “food” included, making the STRING record an odd length.  This causes quite a bit of havoc with a parser well-tuned to expect even length GDS records or accessing records by pointers.
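
The padding rule is easy to get right.  A minimal Python sketch follows; the record type 0x19 and data type 0x06 (ASCII) are the codes as I recall them from the spec, and the function names are hypothetical.

```python
import struct

STRING_RECORD_TYPE = 0x19   # assumed code for STRING
ASCII_DATA_TYPE = 0x06      # assumed code for ASCII string data


def pack_string(text: str) -> bytes:
    """Encode a STRING record, padding with one null only when the length is odd."""
    data = text.encode("ascii")
    if len(data) % 2:
        data += b"\x00"
    return struct.pack(">HBB", 4 + len(data), STRING_RECORD_TYPE, ASCII_DATA_TYPE) + data


def unpack_string(record: bytes) -> str:
    """Decode a STRING record, dropping any trailing null padding."""
    (length,) = struct.unpack(">H", record[:2])
    return record[4:length].rstrip(b"\x00").decode("ascii")


print(pack_string("foo")[4:])    # b'foo\x00'
print(pack_string("food")[4:])   # b'food'
```

Stripping trailing nulls on read is also what lets a forgiving reader survive the always-append-a-null writers, should a customer insist.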

Conclusion:  Do not write nulls at the end of every string.  Read the spec first.  Do not support this unless you have some customer beat you up over it.  Then send me their name.

Compress All GDS Files

GDS is not a space efficient format.  After all, it insists that it takes five points to describe a rectangle, where the first and last points are equal (as for all GDS polygons).  A design team’s natural instinct is to compress all .gds files to .gds.gz to save a fantastic amount of space.  Unfortunately, it also gives a lot of headaches.  If all your program wants to do is a linear read of the file, then one can decompress a bit of the file, parse it, decompress a bit more, and so on, and that’s not so bad.  Some tools in the flow may require decompressing the entire file.  That will be bad.  Not just for the very large file size (>TB can be expected per layer file), but also for the network bandwidth that dealing with such huge files requires.  The difference between GDS and OASIS can be 10x file size, implying 10x longer on the network as well.
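
The linear-read case looks roughly like the Python sketch below, which pulls records one at a time out of a gzip stream without ever holding the whole file.  The iter_records name is made up, and the demo data is just two empty records built in memory (types 0x07 and 0x04, which I believe are ENDSTR and ENDLIB).

```python
import gzip
import io
import struct
from typing import BinaryIO, Iterator, Tuple


def iter_records(stream: BinaryIO) -> Iterator[Tuple[int, int, bytes]]:
    """Yield (record type, data type, payload) for each record, reading strictly forward."""
    while True:
        header = stream.read(4)
        if len(header) < 4:
            return                              # end of stream
        length, rec_type, data_type = struct.unpack(">HBB", header)
        if length < 4:
            raise ValueError("record length smaller than its own header")
        yield rec_type, data_type, stream.read(length - 4)


# Two header-only records, compressed in memory to stand in for a .gds.gz file.
raw = struct.pack(">HBB", 4, 0x07, 0x00) + struct.pack(">HBB", 4, 0x04, 0x00)
with gzip.GzipFile(fileobj=io.BytesIO(gzip.compress(raw))) as gz:
    records = list(iter_records(gz))
print(records)   # [(7, 0, b''), (4, 0, b'')]
```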

If you’re clever enough to randomly access your layout file, for example you want to start reading a cell at a given offset, then you cannot do this with a .gds.gz file.  To start reading in the middle of a .gds.gz file, one has to start decompressing from the file’s beginning all the way to where you want to go.  Not much point then.

Better is to just use OASIS which supports internal compression and random access at the cell level.

Also do not make .oas.gz files, otherwise you lose a lot: you are forced to decompress and re-compress parts of the file that you may not want and that are already compressed internally.  If someone says, “but all our flows require compressing the files,” smile and nod and walk on.

Conclusion:  Do not compress GDS files.  Instead move to OASIS with internal compression, and do not compress OASIS files.

BOX Records

Some folks think that since there is something called a BOX record, which is different from the typical BOUNDARY record for polygons, then BOX must be an efficient rectangle.  After all, the BOUNDARY is horribly inefficient, so BOX must be better, right?  Recall that in the GDS BOUNDARY record that it takes five points to define a rectangle.  Well, the GDS spec for a BOX record says it must have five points.  Actually, it’s the XY record following the BOX record that must contain five points, the first and last of which must be the same.  Note that that does not specify a rectangle, much less a two point rectangle.

Given the prior screed about GDS extensions, then BOX records would consume any number of points limited by a single XY record.  We are left with the conclusion that BOX records provide a sort of alternate universe of polygons which could have been equally described as BOUNDARY records.

Conclusion:  Do not use BOX records for anything.  Make the world a better place by not allowing saving to BOX records.

Property Attributes Beyond 1 to 127

There are two bytes to encode 1 to 127.  You can imagine what it was extended to.  There were reserved property attributes, but there are no Calmas any more and so we no longer worry about what they were reserving.

Unfortunately GDS uses an integer to map to a value which is a string.  OASIS has a string-to-string mapping that is more informative since folks usually don’t remember what property attribute 10 vs 11 means.  This makes for a major disconnect in the translation of GDS to OASIS.

Conclusion:  Support extending property attributes from 0 to 65,535.

Property Value Length Beyond 126

A property value is a string, but is limited in length to 126 characters.  This limit, and others, was due to the machine running the Calmas rather than a limitation in the data format itself.  Hence, this opened up just like strings.  Also see the part above about not always putting nulls on the end of strings.

Conclusion:  Support this, but take care about ending with nulls.
