Introduction to polyDB version 3 (using json format)

Note: This introduction applies to the version of polyDB in the branch feature/polydb-mongo-views of the polymake git. For the released version please see here. Also, this tutorial explains how to use the polyDB from within polymake. If you want to access the data without using polymake please check here.

The polyDB extension provides access to the polyDB database. It comes bundled with polymake, so there is no need to install extra software, except for the MongoDB.pm perl package. If you encounter any errors or problems concerning polyDB, please don't hesitate to ask in the forum.

However, the polymake extension is not necessary to use the data. You can also access the data

  • via the web interface (note: this currently obtains its data from the old version)
  • using the mongo shell directly or via any gui (see below for more details (to be written))

Software developers can also include access to polyDB using any of the many MongoDB interfaces and use the data directly in their programs. The few structural assumptions made in the database that you need to follow in your development are explained below (to be written).

The data in polyDB is stored in collections, which can be organized in nested sections. Access to a certain collection requires the full path to it, i.e. both the (nested) section containing the collection and the collection name. E.g. the collection of smooth reflexive Fano polytopes up to dimension nine is the collection SmoothReflexive contained in the Lattice subsection of the section Polytopes. The section separator is a dot (.), so you find the data in

  • section: Polytopes.Lattice
  • collection: SmoothReflexive

Subsections can also be listed in front of the collection, so the following pairs (s,c) of section and collection lead to the same data: (Polytopes,Lattice.SmoothReflexive) and ( ,Polytopes.Lattice.SmoothReflexive).
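
The equivalence of these pairs can be pictured as joining section and collection with the dot separator. The following perl sketch is only an illustration: full_path is a hypothetical helper, not a polyDB function, and it assumes that an empty section is simply skipped.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of polyDB): join section and collection
# into one dot-separated path, skipping an empty or undefined part.
sub full_path {
    my ($section, $collection) = @_;
    return join ".", grep { defined($_) && length($_) } ($section, $collection);
}

my @pairs = (
    ["Polytopes.Lattice", "SmoothReflexive"],
    ["Polytopes",         "Lattice.SmoothReflexive"],
    ["",                  "Polytopes.Lattice.SmoothReflexive"],
);

# All three pairs yield the same full path.
print full_path(@$_), "\n" for @pairs;
```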

The command

db_info

gives you a list of collections you can read, ordered by sections. You can restrict the search by

  • providing a chain of sections and subsections in the argument section. This can also just be an initial substring, so
    db_info(section=>"Polytopes.La")

    lists all collections within the section Polytopes whose first subsection starts with La

  • given a specific section, you can filter the collections by an initial substring, so
    db_info(section=>"Polytopes.Lattice", collection=>"Smoo")

    lists all collections within the section Polytopes.Lattice whose name starts with Smoo.

If you have access to a collection that is not publicly available, please set the variables $PolyDB::default::db_user and $PolyDB::default::db_pwd first to see your private collections in the above list.

There are various functions available to query data:

  • db_count: to count the number of objects with given properties
  • db_query: to get an array of objects with given properties
  • db_cursor: to obtain a cursor on objects with given properties
  • db_aggregate: to apply complex aggregation pipelines on a collection

The main argument of the first three functions is a MongoDB query hash. You can use the full MongoDB query syntax as described here. Note that polymake uses the perl interface, so the query should be given as a perl hash instead of a json document (it mostly suffices to use => instead of :). For some perl examples see here. Basic queries for one or more parameters look like

{"N_VERTICES"=>10}
{"DIM"=>5, "N_FACETS"=>7}

Bounds or ranges can be defined with the operators $gt, $gte, $lt and $lte. In perl, put these operators in single quotes so that the $ is not interpolated as a variable. For example

{"N_VERTICES"=> { '$gte' => 5, '$lte' => 10 } }

returns documents where the number of vertices is between five and ten (including the boundaries). More operators can be found here. You can also query for elements in arrays either somewhere in the array or at a specific position.
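
As a small sketch of how such queries look as perl data, the block below builds a range query and the two kinds of array query as plain hashes. It runs without polymake and does not contact the database. The field name F_VECTOR is an assumption chosen for illustration; the dot notation for array positions is standard MongoDB query syntax.

```perl
use strict;
use warnings;

# Range query: single quotes keep '$gte' and '$lte' as literal keys.
my $range = { "N_VERTICES" => { '$gte' => 5, '$lte' => 10 } };

# Array queries (F_VECTOR is an assumed field name):
# match documents whose array contains the value 20 somewhere ...
my $anywhere = { "F_VECTOR" => 20 };
# ... or whose array has the value 20 at position 0 (dot notation).
my $at_position = { "F_VECTOR.0" => 20 };

# Show the operator keys of the range query as perl stored them.
print join(" ", sort keys %{ $range->{"N_VERTICES"} }), "\n";
```

Any of these hashes could then be passed as the first argument of db_count, db_query or db_cursor.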

The last function lets you pass an aggregation pipeline as described here (note again that the pipeline needs to be passed as a perl hash instead of a json document).
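
To give an idea of what a pipeline looks like in perl, the sketch below writes two standard MongoDB stages ($match and $group) as perl data. It is an assumption that the stages are collected in an array reference, as the MongoDB perl driver expects; how db_aggregate itself takes the pipeline is not shown here, so the block only builds and inspects the data.

```perl
use strict;
use warnings;

# Stage 1: keep only documents with DIM equal to 4.
# Stage 2: group them by N_VERTICES and count the members of each group.
# Assumption: stages are listed in an array reference, in execution order.
my $pipeline = [
    { '$match' => { "DIM" => 4 } },
    { '$group' => { "_id" => '$N_VERTICES', "count" => { '$sum' => 1 } } },
];

# Print the stage operators in pipeline order.
print join(" ", map { (keys %$_)[0] } @$pipeline), "\n";
```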

All queries need the name of the section and collection. You can either pass this via the options section and collection or, in particular if you query the same collection several times in a row, via the variables $PolyDB::default::db_section_name and $PolyDB::default::db_collection_name. If you set them via set_custom, this is even persistent over polymake sessions. So a query could look like

db_query({"DIM"=>4}, section=>"Polytopes.Lattice", collection=>"SmoothReflexive");

or

$PolyDB::default::db_section_name="Polytopes.Lattice";
$PolyDB::default::db_collection_name="SmoothReflexive";
db_query({"DIM"=>4});

db_query returns an array of polytopes, db_count just counts the number of objects matching your query, while db_cursor returns a cursor over the objects matching your query. You can iterate over the result via

$c=db_cursor({"DIM"=>4}, section=>"Polytopes.Lattice", collection=>"SmoothReflexive");
# fetch the matching polytopes one at a time
while ( $c->has_next() ) {
  $p=$c->next();
  print $p->N_VERTICES, "\n";
}

There are further options to control the return:

  • skip=>$n: skips the first $n documents
  • limit=>$n: returns at most $n documents
  • representative=>1: returns just one document matching the query

There are two pairs of custom variables for access credentials:

  • $PolyDB::default::db_{user,pwd}: For a user that has read access (usually set to polymake/database for all public collections). Set this to your private user if you have been granted access to private collections
  • $PolyDB::default::db_collection_admin_{user,pwd}: For credentials with write access to collections. Note that the first pair is not checked for write access even if the user given there has write access. You must set this pair for write access.

Note that you need write access to write data into polyDB. Also, initiating a new collection needs administrator permissions on the database. Please contact us (directly or via the forum) to obtain write access to a (new) collection and have us do the basic initialization of a new collection.

Inserting data requires several steps:

  • You need to prepare a json document with meta information about your collection in the following form
    {
       "description" : "Smooth reflexive polytopes with decomposition structure in dimensions 1 to 9",
       "contributor" : "Andreas Paffenholz",
       "maintainer" : "Andreas Paffenholz",
       "creator" : "Andreas Paffenholz",
       "fields" : {
           "ALTSHULER_DET": 1,
           "BALANCED": 1,
           "VERY_AMPLE" : 1
       },
       "polydb_version" : "2.1",
       "packages": {
           "polymake" : {
              "version" : "3.4",
              "type" : "polytope::Polytope<Rational>"
           }
       },
       "uri" : "http://polymake.org/polytopes/paffenholz/www/fano.html"
    }

  • You need to provide a full json schema describing your data. If you have a polymake object with the data you want, then the function create_restrictive_schema can help you with this and provide an initial template. In polyDB the json schema is stored as the schema entry of a document also specifying the section and collection in the entries section and collection. Both JSON schemas and MongoDB use $ as a special character to specify functions. This leads to a clash when you try to store a json schema in MongoDB. Hence, in the schema document we replace $$ with __ for storing and restore this when reading the schema.
  • If you want your collection to be included in the db_info command, you need a json document describing your collection in the form

    {
       "collection" : "TOM",
       "section" : [ "DocTropical" ],
       "maintainer" : "Andreas Paffenholz",
       "contributor" : "Silke Horn",
       "author" : "Silke Horn",
       "polydb_version" : "2.1",
       "description" : "All known non-realisable tropical oriented matroids with parameters n=6, d=3 or n=d=4. They were computed using polymake with the tropmat extension and Topcom. You need the extension tropmat for this.",
       "short_description" : "All known non-realisable tropical oriented matroids with parameters n=6, d=3 or n=d=4.",
       "webpage" : [
           {
               "description" : "polymake extension tropmat",
               "address" : "http://solros.de/polymake/tropmat"
           }
       ]
    }

  • If you also want to place the collection in a new section, then this section (and every new subsection created) also needs a description document.
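
The renaming of schema keys for storage, mentioned in the schema step above, can be sketched in plain perl. This is not the polyDB implementation: rename_keys, escape_schema and unescape_schema are hypothetical helpers, and the sketch assumes that the replacement applies to a leading $ in key names such as $schema or $ref.

```perl
use strict;
use warnings;

# Hypothetical helper (not a polyDB function): recursively rename hash keys
# that start with the prefix $from so that they start with $to instead.
sub rename_keys {
    my ($data, $from, $to) = @_;
    if (ref $data eq 'HASH') {
        my %out;
        for my $key (keys %$data) {
            (my $new = $key) =~ s/^\Q$from\E/$to/;
            $out{$new} = rename_keys($data->{$key}, $from, $to);
        }
        return \%out;
    }
    if (ref $data eq 'ARRAY') {
        return [ map { rename_keys($_, $from, $to) } @$data ];
    }
    return $data;   # scalars are returned unchanged
}

# Assumption: only a leading '$' of a key is replaced by '__'.
sub escape_schema   { rename_keys($_[0], '$',  '__') }
# Note: this would also rename a genuine key that starts with '__'.
sub unescape_schema { rename_keys($_[0], '__', '$')  }

my $schema = {
    '$schema'    => "http://json-schema.org/draft-07/schema#",
    'properties' => { 'DIM' => { '$ref' => "#/definitions/int" } },
};
my $stored = escape_schema($schema);
print join(" ", sort keys %$stored), "\n";
```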

Meta information, schema and documentation are stored with the functions

db_set_collection_meta_information($meta); 
db_set_collection_schema($schema); 
db_write_collection_metadata(file=><file>);

where in the first two functions the argument is either a perl hash or the name of a file containing a json document.
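
If you prefer to pass a perl hash rather than a file name, one possible way to obtain it is to decode the json document with the core module JSON::PP. The sketch below uses a shortened copy of the meta information shown above and runs without polymake; the call to db_set_collection_meta_information appears only as a comment.

```perl
use strict;
use warnings;
use JSON::PP;

# A shortened version of the meta information document, as a json string.
my $json = <<'END_JSON';
{
   "description" : "Smooth reflexive polytopes with decomposition structure in dimensions 1 to 9",
   "maintainer" : "Andreas Paffenholz",
   "polydb_version" : "2.1",
   "packages" : { "polymake" : { "version" : "3.4", "type" : "polytope::Polytope<Rational>" } }
}
END_JSON

# decode_json dies on malformed json, e.g. on a typographic closing quote.
my $meta = decode_json($json);
print $meta->{"packages"}{"polymake"}{"type"}, "\n";

# Inside polymake, $meta could then be handed to
# db_set_collection_meta_information($meta);
```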

Insertion is done with the function db_insert. This function takes either a file, a single polymake big object, or an array of such objects as its first argument and writes this data into the collection specified by the options section and collection. As for queries, you can set these via custom variables and then don't need to specify them in db_insert. Currently you need to specify use_schema => 1 in the command to use the meta information and the json schema you provided. Further options are

  • type_information: to specify a different json schema as a perl hash
  • replace: to replace an existing document
  • noinsert: For a dry run of the command.

Starting a new collection

A new collection is started with the command

db_admin_initiate_collection(section=><section>, collection=><collection>);

where you can omit the two options if you have set the section and collection name with the two custom variables $PolyDB::default::db_section_name and $PolyDB::default::db_collection_name before. If the collection should not be public, then also pass the option public=>false. For a public collection the read access role of the new collection is added to the default role polymakeUser, which is granted to every user of polyDB. One can add this later if one wants to build up and test the collection before making it publicly available.

If this creates new intermediate subsections you should set the section documentation with

db_write_section_metadata(file=><file>);

so that the new collection appears in the list printed by db_info for all users with sufficient permissions.

Note that the first command essentially only creates two new roles in MongoDB, one for read access to the collection (and all sections up to the root) and one for write access to the collection (and only to the collection, not to the sections). The actual collections are only created once the first document is written into the collection. This implies that collections will not be listed with db_info if any of the intermediate sections has no documentation, as then the collection where this is stored is not created.

Any user that has the write access role (which you can assign with db_admin_add_user_to_collection) can insert, delete and modify documents in the collection, as well as the meta information, the schema and the documentation of the collection.

Access Credentials

There are three pairs of custom variables for access credentials:

  • $PolyDB::default::db_{user,pwd}: For a user that has read access (usually set to polymake/database for all public collections)
  • $PolyDB::default::db_collection_admin_{user,pwd}: For credentials with write access to collections
  • $PolyDB::default::db_admin_{user,pwd}: For MongoDB admin credentials. This is used instead of the collection admin credentials if those are not set.

Last modified: 2020/02/06 15:06 by pvater