Databases have become ubiquitous. In addition to mainframe and client-server systems, relational database systems are utilized by embedded applications, on smartphones, within web browsers – just about everywhere. The article below provides some background regarding the necessary properties of “ubiquitous” database management systems, and also provided the title of this blog.
This blog entry is copyright by Sybase, Inc., an SAP company, and first appeared on Glenn Paulley’s Sybase blog (http://iablog.sybase.com/paulley) on July 10, 2010. It is reprinted here with permission.
SQL Anywhere as a ubiquitous DBMS
In the latest (December 2009) edition of ACM SIGMOD Record that arrived on my desk on Wednesday, Kyu-Young Whang et al. survey  a selection of research prototypes and commercial DBMS products that fit the notion of a ubiquitous database. Here’s the paper’s abstract:
Advancement in mobile computing technologies has prompted strong needs for database systems that can be used in small devices such as sensors, cellular phones, PDAs, car navigators, and Ultra Mobile PCs (UMPCs). We term the database systems that are customizable for small computing devices as Ubiquitous Database Management Systems (UDBMSs). In this paper, we first review the requirements of the UDBMS. The requirements identified include lightweight DBMSs, selective convergence, flash-optimized storage systems, data synchronization, support of unstructured/semi-structured data, complex database operations, self-management, and security. Next, we review existing systems and research prototypes. We review the functionality of UDBMSs including the footprint size, support of standard SQL, transaction management, concurrency control, recovery, indexing, and access control. We then review the supportability of the requirements by those UDBMSs surveyed. We finally present research issues related to the UDBMS.
Based on previous surveys by Nori  and Bernard et al. , Whang et al. identify the following requirements of UDBMS – lightweight operation, selective convergence, support for new storage models (particularly flash), support for complex operations, unstructured or semi-structured data, synchronization, self-management, and security – and then assess the capabilities TinyDB, PicoDBMS, Oracle Berkeley DB, Oracle 10g Lite, IBM DB2 Everyplace, MS SQL Server for CE, and KAIST’s Odysseus DBMS prototype with respect to these requirements.
In this article, I’d like to analyze this set of requirements and outline the support for these requirements in both SQL Anywhere server and our UltraLite DBMS, since both were omitted from the set of products surveyed despite Sybase’s leadership position in embedded databases as acknowledged by Gartner. Somewhat surprisingly, SQLite was also omitted from the list of surveyed products, despite its ubiquitous presence on iPhone, Android, Symbian, and (soon) Blackberry platforms, in addition to desktop tools such as Mozilla Firefox and Skype.
Using a low-clock CPU, small memory, and small storage, a UDBMS needs to support the functionalities [sic] required by applications with acceptable performance. Furthermore, since ubiquitous devices have limited power sources such as batteries, a UDBMS needs to support the functionalities [sic] with low power consumption. Thus, it is important to design and implement a UDBMS as simple as possible [sic] considering the performance of [these] devices.
To be sure, all of the DBMS looked at in this study are not scaled-down versions of enterprise-class DBMS software, but rather are products built from the ground up to run on small footprint devices. SQL Anywhere is unique in that the SQL Anywhere server that runs on Windows Mobile PDAs is nearly identical to the 64-bit version that runs on server-class hardware. A SQL Anywhere server’s footprint on Windows Mobile is 7.8MB, significantly larger than the published size of 2.5MB for MS SQL Server Compact Edition, except that SQL Server CE also requires the Visual C++ runtime library whose minimum download size is 4.3MB. Unlike the other products in the survey, SQL Anywhere is full-featured on the Windows Mobile platform, and supports stored procedures, triggers, OLAP functionality, table functions, self-healing statistics management, full text search, materialized views, XML and spatial data.
UltraLite, on the other hand, is much more lightweight, though the reduced footprint comes at the expense of functionality. Applications can link UltraLite statically, call it as a DLL, or connect to an UltraLite server. The Version 12 UltraLite static library is 4.2MB, though the static library contains symbol information required for linking which significantly adds to its size. A fairly plain (but complete) UltraLite application is as small as 670K – only the functionality required by the application is linked in. The UltraLite DLL and server are each approximately 800K; with the server, the client application requires an additional 140K.
Commercial UDBMSs have shrunken the [sic] footprint size and memory usage by simplifying the functionalities [sic] of server DBMSs, but commercial vendors have not reported the techniques they adopted.
References [4-6] below outline some of the technologies used within SQL Anywhere to lower its runtime memory footprint. By optimizing over left-deep trees, SQL Anywhere’s branch-and-bound join enumeration algorithm – a re-developed, extended version of the one described in  – can utilize the stack for most of the temporary data structures required during the optimization process, significantly reducing the amount of memory required for optimization. As an example, a 100-way join query against a small TPC-H database can be optimized and executed by SQL Anywhere on a Dell Axim handheld device, running Windows Mobile 5, with as little as 3 MB of buffer pool, with only 1 MB of memory needed for optimization. 
As the ubiquitous environment evolves, many advanced applications that have been used in servers will also be required of ubiquitous devices. However, even high performance devices such as PDAs and UMPCs still lack computing power to run a fully-fledged DBMS that supports as many functionalities as server DBMSs do. Thus, selective convergence will become a very important technique since it enables lightweight DBMSs by selectively composing the modules of a fully-fledged DBMS. We expect research on designing and implementing a new DBMS architecture that supports selective convergence will soon become active.
“Selective convergence” or “componentization”  is simply the idea including only the necessary database components with a deployed application. As such, to me this is not a research issue but an engineering one. Nonetheless, the benefits of selective convergence are subject to tradeoffs. Firstly, a server architecture that permits a la carte components is much more difficult to test. Secondly, there are often dependencies between server code and the data store which can make componetization much more complex, particularly with server upgrades or new application requirements.
On Windows Mobile, SQL Anywhere disables support for some memory-intensive query execution operators, along with multidatabase access. It does not support componentization in the sense outlined above. UltraLite’s static library supports componentization in that applications link only UltraLite code elements required for their application.
Flash-optimized storage systems
Cache management and query execution techniques that exploit the characteristics of flash storage are fertile areas for additional research. Though SQL Anywhere does not currently contain implementations of flash-optimized query processing algorithms, SQL Anywhere does offer the ability for DBAs to customize the server’s cost model via a calibration process. That calibrated model can then be deployed (embedded) with any database.
An issue with a flash-optimized storage format is its incompatibility with other OS file systems. SQL Anywhere is unique in the industry in that SQL Anywhere’s database format is identical on every supported platform, including Mac OS/X, Sun, AIX, Linux, Windows, and Windows Mobile – an image copy of a database is a trivial file copy, and once copied the identical database can be started on any platform. This feature makes database deployments significantly easier and is critical for problem determination when an issue occurs on a deployed device.
Main research issues on data synchronization are (1) efficiently maintaining data synchronization between a huge number of ubiquitous devices and the server, (2) resolving conflicts when there are different versions of the same data among ubiquitous devices and the server, (3) recovering from crash and restarting data synchronization when system failure occurs during data synchronization.
In my view these engineering issues are solved problems, although quantifying “huge” is important. In 2009, UltraLite was deployed on over 140,000 Windows Mobile handheld devices for the 2010 US Census, using iAnywhere’s MobiLink technology to synchronize with an Oracle consolidated server using MobiLink’s conflict resolution mechanisms. Daudjee and Salem [7,8], amongst other researchers, have looked at correctness issues with database replication, though not at weaker isolation levels that are typically used in practice.
Since ubiquitous devices (including PDAs and UMPCs) lack computing power to run a fully-fledged DBMS that supports unstructured/semi-structured data, research on developing a lightweight version of the unstructured/semi-structured data management module considering the specification and performance of the ubiquitous devices needs to be conducted. For example, in a fully-fledged DBMS, one could implement many existing query processing methods and combine them under a cost-based optimization framework so as to select the best query evaluation plan. However, since ubiquitous devices have small memory and storage, it may not be feasible to include many query processing methods and the optimization module in the UDBMS.
On ubiquitous platforms that typically ship with sufficient memory – such as virtually all Windows Mobile devices – cost-based query optimization is certainly feasible, as has been amply demonstrated by SQL Anywhere. In our experience, query and schema complexity is not reduced on with sophisticated business applications on hand-held devices; queries with join degrees between 20 and 30 (or more) are commonplace. Indeed, we are aware of SQL Anywhere customers who are utilizing materialized views to overcome the lack of CPU horsepower on handheld devices, which can cause performance issues with complex queries.
Unlike the other commercial products surveyed in this paper, the latest release of SQL Anywhere, SQL Anywhere 12, supports XML, full text search, and spatial data on every platform including Windows Mobile handhelds.
Complex Operations for Advanced Applications
Advanced database applications require complex operations such as data mining that cannot be implemented by [sic] SQL. As ubiquitous devices are rapidly evolving, many advanced applications that have been used in servers will also be required of ubiquitous devices. Thus, a UDBMS should be able to support complex operations.
Unfortunately, Whang et al. do not outline the data mining (or other complex operations) functionality that they have in mind. I will point out, however, that SQL Anywhere is by far the most feature-rich of the UDBMS surveyed in this paper. SQL Anywhere supports full-text search, integrated into SQL, along with OLAP functionality including
WINDOW functions and extended
GROUP BY processing as outlined in the ISO SQL/2008 standard.
Unlike in traditional DBMSs, in a UDBMS, there can be no database administrator (DBA) to manage the database. Thus, a UDBMS needs to support self-management functionalities [sic]. That is, it needs to automatically perform operations like backup, restore, recovery, indexing, and tuning.
Self-management is one of the key technologies offered by SQL Anywhere. As described in [4-5] and , SQL Anywhere includes automatic cache sizing, self-managing and self-healing statistics management, adaptive query execution strategies, active database functionality (events) and dynamic multiprogramming level adjustment (not enabled on Windows Mobile), in addition to DBA tools such as the SQL Anywhere index consultant, stored procedure debugger, and Application Profiling.
Since ubiquitous devices often contain personal data such as banking and healthcare data, a UDBMS needs to ensure the [sic] data security by providing access control policies.
Both SQL Anywhere and UltraLite support RSA and ECC encryption for the database file. RSA and ECC encryption for client/server communication is supported as well, so that MobiLink synchronization to a consolidated database can also be strongly encrypted. SQL Anywhere’s access control policies are identical on the Windows Mobile platform as they are for all other platforms.
The “ubiquitous DBMS” feature set outlined in this paper describes many of the features built into both SQL Anywhere and UltraLite, particularly in the areas of self-management, footprint, security, complex processing, synchronization, and unstructured/semi-structured data support. In terms of research issues, in my view it is query processing algorithms and buffer pool management techniques that exploit flash memory which are of the greatest near-term importance.
 Kyu-Young Whang, Il-Yeol Song, Taek-Yoon Kim, and Ki-Hoon Lee (December 2009). The Ubiquitous DBMS. ACM SIGMOD Record 38(4), pp. 14-22.
 Guy Bernard, Jalel Ben-Othman, Luc Bouganim, et al. (June 2004). Mobile Databases: a Selection of Open Issues and Research Directions. ACM SIGMOD Record 33(2), pp. 78-83.
 Ivan T. Bowman, Peter Bumbulis, Dan Farrar, Anil K. Goel, Brendan Lucier, Anisoara Nica, G. N. Paulley, John Smirnios, Matthew Young-Lai (September 2007). SQL Anywhere: An Embeddable DBMS. IEEE Data Engineering Bulletin 30(3), pp. 29-36.
 Ivan T. Bowman and G.N. Paulley (February 2000). Join Enumeration in a Memory-Constrained Environment. In Proceedings, 16th International Conference on Data Engineering, pp. 645-654.
 I. T. Bowman et al. (April 2007). SQL Anywhere: A Holistic Approach to Database Self-Management. In Proceedings, 2nd International IEEE Workshop on Self-Managing Database Systems, Istanbul, Turkey.
 Khuzaima Daudjee and Kenneth Salem (September 2006). Lazy database replication with snapshot isolation. In Proceedings of the International Conference on Very Large Data Bases (VLDB’06), pages 715-726.
 Khuzaima Daudjee and Kenneth Salem (April 2004). Lazy database replication with ordering guarantees. In Proceedings of the IEEE International Conference on Data Engineering (ICDE’04), pages 424-435.
 Mohammed Abouzour, Kenneth Salem, and Peter Bumbulis (March 2010). Automatic Tuning of the Multiprogramming Level in Sybase SQL Anywhere. In Proceedings, 5th International Workshop on Self-Managing Database Systems, Long Beach, California.