CDB Attribute Allocation

Original Post by: RyanFranz Thu Apr 10 21:41:56 2014


In section 5.7.1.8 of the CDB 3.2 spec, there is a table that shows at which shapefile level an attribute should appear. In CDB 3.0, it was very explicit that each attribute appears in only one place (except for a mistake or two). In CDB 3.1, the table was introduced, but it expressed preferences for certain attributes to appear at certain levels, and they mostly made sense. Here in CDB 3.2, it looks like the preference is to put all the attribution at the instance level (see all the green "P" entries). Was this intended?


If so, it seems that it does away with the reason to have the different attribute levels, which was to reduce the size of the *.dbf files in the shapefile layers. Maybe the I/O for reading one large file is better than reading three small DBF files, but it seems that the spec is going to make me read all the files anyway, because I don't know where any attribute might be.


Original Post by: B. Leclerc Mon Apr 14 18:00:15 2014


Ryan,


It is the intention of CDB 3.2 to promote storing all CDB attributes with the instance and avoid writing the class file altogether. That explains why Table 5-27 indicates a clear preference to store attributes at the instance level. Now, for the consumer of these files, you have basically two choices: 1) you start by reading the Shapefile of the instance and if a record has a value for the CNAM attribute, you read the Shapefile of the class; or 2) you systematically read both Shapefiles and deal with the absence of the class file when none is present. Note that the same reasoning applies to reading the Shapefile containing extended attributes; you may decide to wait for the presence of values in the CEAI, GEAI, or VEAI attributes before reading the Shapefile of the extended attributes, or you may attempt reading it before parsing the instance or class files in case there are extended attributes referenced by a particular feature.
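A minimal sketch of option 1, assuming the instance-level records have already been parsed into plain dictionaries. The names `merge_attributes`, `load_class_table`, and `is_blank` are hypothetical helpers for illustration, not part of any CDB API, and the rule that instance-level values take precedence over class-level ones is an assumption of this sketch rather than something stated in the spec text quoted here.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, object]
ClassTable = Dict[str, Record]  # class-level records keyed by CNAM


def is_blank(value: object) -> bool:
    """Treat a field as unset when it is missing or only whitespace."""
    return value is None or str(value).strip() == ""


def merge_attributes(
    instance_records: List[Record],
    load_class_table: Callable[[], Optional[ClassTable]],
) -> List[Record]:
    """Read the class-level table only if some instance record sets CNAM.

    load_class_table is a hypothetical callback that reads the class
    Shapefile's *.dbf and returns None when that file does not exist.
    """
    class_table: ClassTable = {}
    loaded = False
    merged: List[Record] = []
    for rec in instance_records:
        result = dict(rec)
        cnam = rec.get("CNAM")
        if not is_blank(cnam):
            if not loaded:
                # Deferred read: happens at most once, and only when needed.
                class_table = load_class_table() or {}
                loaded = True
            class_attrs = class_table.get(str(cnam).strip(), {})
            # Assumption: instance values win; class values fill the gaps.
            for key, value in class_attrs.items():
                if is_blank(result.get(key)):
                    result[key] = value
        merged.append(result)
    return merged
```

The same deferred-read pattern could be applied to the extended-attributes Shapefile by checking CEAI, GEAI, or VEAI instead of CNAM. Option 2 would simply call the loader up front and treat a `None` result as an empty table.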


We decided to promote the sole use of the instance Shapefile after reviewing how class-level attributes were used in typical 3.0 databases. It was a premature optimization in CDB 3.0 to factor out class attributes that are common to several instances of a feature. The savings are not worth the cost of the extra I/O, and that is before counting the added complexity in the code. For compatibility, it is not possible to discard class Shapefiles, but we can recommend storing all attributes with the instance feature, because this was already mentioned in CDB 3.0 in section 5.3.1.2: <i>If the classname is not used, its value is set to blank, and all of the classname attributes must be added to the instance-level *.dbf file</i>.


Bernard

Original Post by: RyanFranz Mon Apr 14 18:49:02 2014


Thanks Bernard! I can see that the smaller files that result from the class-level attributes don't really make up for the extra file read. From my experience, class-level attributes worked best for GT points and GS light points, where the same model/light was used over and over again, but not as well for GS points. I just didn't know what direction the spec was intending to go.


Will the class-level attributes show up as deprecated in a future spec?


Thanks again!

Original Post by: B. Leclerc Mon Apr 14 18:53:13 2014


Indeed! The intent is to eventually deprecate class-level attributes in a future release of the Spec.

Original Post by: RyanFranz Mon Apr 14 20:44:08 2014


This might be nice to mention if there is a CDB 3.2 clarification.