Storage Backend =============== The storage backend implements the "low level" folder abstraction, which is basically an arbitary "BLOB" containing some document. The feature is that we extract "quick access" / "searchable" attributes from the document content. Further it contains the "folder management" API, as named folders can be stored in different databases. Note: we need a way to tell where "new" folders should be created Note: to sync with LDAP we need to periodically delete or archive old folders Folders have associated a type (like 'calendar') which defines the query attributes and serialization format. TODO ==== - hierarchies deeper than 4 (properly filter on path in OCS) Open Questions ============== System-meta-data in the blob-table or in the quick-table? - master data belongs into the blob table - could be regular 'NSxxx' keys to differentiate meta data from Class Hierarchy =============== [NSObject] OCSContext - tracking context OCSFolder - represents a single folder OCSFolderManager - manages folders OCSFolderType - the mapping info for a specific folder-type OCSFieldInfo - mapping info for one 'quick field' OCSChannelManager - maintains EOAdaptorChannel objects TBD: - field 'extractor' - field 'value' (eg array values for participants?) - BLOB archiver/unarchiver Defaults ======== OCSFolderInfoURL - the DB URL where the folder-info table is located eg: http://OGo:OGo@localhost/test/folder_info OCSFolderManagerDebugEnabled - enable folder-manager debug logs OCSFolderManagerSQLDebugEnabled - enable folder-manager SQL gen debug logs OCSChannelManagerDebugEnabled - enable channel debug pooling logs OCSChannelManagerPoolDebugEnabled - debug pool handle allocation OCSChannelExpireAge - if that age in seconds is exceeded, a channel will be removed from the pool OCSChannelCollectionTimer - time in seconds. each n-seconds the pool will be checked for channels too old [PGDebugEnabled] - enable PostgreSQL adaptor debugging URLs ==== "Database URLs" We use the schema: postgresql://[user]:[password]@[host]:[port]/[dbname]/[tablename] BLOB Formats ============ - TBD - iCal, XML - problem with iCal is that a complete and valid iCal file is different from just the vevent. so we basically need to parse those files in any case. - XML: the iCal SaxDriver reports XML events, so we can easily store that as XML - we need to parse the BLOB for different clients anyway (iCal != iCal ...) - XML: we could use some XML query extension to PG in the future? Update: we now have OCSiCalFieldExtractor - it parses the BLOB as an iCalendar file and extracts a set of fixed keys: - title - plain copy of "summary" - uid - plain copy - startdate - date as utime - enddate - date as utime - participants - CNs of attendees separated by comma ", " - location - partmails - sequence - TBD: iscyclic - TBD: isallday - TBD: cycles - I guess the client should fetch the BLOB to resolve - the field extractor is accessed by OCSFolder using the folderinfo: extractor = [self->folderInfo quickExtractor]; quickRow = [extractor extractQuickFieldsFromContent:_content]; Support Tools ============= - tools we need: - one to recreate a quick table based on the blob table Notes ===== - need to use http:// URLs for connect info, until generic URLs in libFoundation are fixed (the parses breaks on the login/password parts) QA == Q: Why do we use two tables, we could store the quick columns in the blob? == They could be in the same table. I considered using separate tables since the quick table is likely to be recreated now and then if BLOB indexing requirements change. Actually one could even use different _quick tables which share a common BLOB table. (a quick table is nothing more than a database index and like with DB indexes multiple ones for different requirements can make sense). Further it might improve caching behaviour for row based caches (the quick table is going to be queried much more often) - not sure whether this is relevant with PostgreSQL, probably not? Q: Can we use a VARCHAR primary key? == I asked in the postgres IRC channel and apparently the performance penalty of string primary keys isn't big. We could also use an 'internal' int sequence in addition (might be useful for supporting ZideLook) Motivation: the 'iCalendar' ID is a string and usually looks like a GUID. Q: Why using VARCHAR instead of TEXT in the BLOB? == To quote PostgreSQL documentation: "There are no performance differences between these three types, apart from the increased storage size when using the blank-padded type." So varchar(xx) is just a large TEXT. Since we intend to store mostly small snippets of data (tiny XML fragments), I considered VARCHAR the more appropriate type.