Optimization Dataset

The OptimizationDataset collection represents the results of geometry optimization calculations performed on a series of Molecules. The OptimizationDataset stores its metadata in OptimizationSpecification and QCSpecification objects, which hold the parameters of the geometry optimizer and of the underlying gradient calculation, respectively.

Existing OptimizationDataset collections can be listed with FractalClient.list_collections("OptimizationDataset"), and a single collection can be retrieved by name with FractalClient.get_collection("OptimizationDataset", name).

Querying the Data

All available optimization specifications can be listed with:

>>> ds.list_specifications()

To show the status of the optimization calculations for a given set of specifications, use:

>>> ds.status(["default"])

For each Molecule, the number of steps in its geometry optimization can be queried with:

>>> ds.counts()

Individual OptimizationRecords can be obtained using:

>>> ds.get_record(name="CCO-0", specification="default")
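When records for several entries are needed, the get_record call above can be wrapped in a small helper. The following is a minimal sketch: collect_records and _StubDataset are hypothetical names introduced here for illustration, and the stub only stands in for a real dataset so that the example runs without a server.

```python
def collect_records(ds, entry_names, specification="default"):
    """Map each entry name to the record returned by ds.get_record."""
    return {
        name: ds.get_record(name=name, specification=specification)
        for name in entry_names
    }


class _StubDataset:
    """Hypothetical stand-in for illustration only; not part of qcportal."""

    def __init__(self, records):
        # records maps (entry name, specification name) -> record object
        self._records = records

    def get_record(self, name, specification):
        return self._records[(name, specification)]


stub = _StubDataset({("CCO-0", "default"): "record-0",
                     ("CCO-1", "default"): "record-1"})
records = collect_records(stub, ["CCO-0", "CCO-1"])
```

With a real OptimizationDataset in place of the stub, the dictionary values would be the OptimizationRecord objects returned by the server.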

Statistics and Visualization

The energy trajectory over the course of a geometry optimization can be plotted with the qcportal.models.OptimizationRecord.show_history() function.

Creating the Datasets

A new OptimizationDataset collection object can be created as follows, where client is an existing FractalClient connected to a server:

>>> import qcportal as ptl
>>> ds = ptl.collections.OptimizationDataset(name="QM8-T", client=client)

A specific set of parameters for geometry optimization can be defined and added to the dataset as follows:

>>> spec = {'name': 'default',
...         'description': 'Geometric + Psi4/B3LYP-D3/Def2-SVP.',
...         'optimization_spec': {'program': 'geometric', 'keywords': None},
...         'qc_spec': {'driver': 'gradient',
...                     'method': 'b3lyp-d3',
...                     'basis': 'def2-svp',
...                     'keywords': None,
...                     'program': 'psi4'}}

>>> ds.add_specification(**spec)

>>> ds.save()
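If several specifications share the same layout, the dictionary can be generated by a helper rather than written out each time. This is a sketch only: make_spec is a hypothetical function, not part of qcportal, and it simply reproduces the key structure of the example above.

```python
def make_spec(name, method, basis, program="psi4",
              optimizer="geometric", description=None):
    """Build a specification dict with the same keys as the example above."""
    return {
        "name": name,
        "description": description,
        "optimization_spec": {"program": optimizer, "keywords": None},
        "qc_spec": {
            "driver": "gradient",  # optimizations are driven by gradients
            "method": method,
            "basis": basis,
            "keywords": None,
            "program": program,
        },
    }


spec = make_spec("default", "b3lyp-d3", "def2-svp",
                 description="Geometric + Psi4/B3LYP-D3/Def2-SVP.")

# With a dataset object in hand, the dict would then be passed on as:
#   ds.add_specification(**spec)
#   ds.save()
```

The top-level keys match the name, description, optimization_spec, and qc_spec arguments of add_specification, so the dictionary can be unpacked directly with `**spec`.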

Molecules can be added to the OptimizationDataset as new entries for optimization via:

>>> ds.add_entry(name, molecule)

When adding multiple molecules, postpone saving the dataset to the server until all entries have been added:

>>> for name, molecule in new_entries:
...     ds.add_entry(name, molecule, save=False)

>>> ds.save()
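The effect of save=False can be seen with a small recording object. The sketch below is illustrative: add_entries_in_bulk and _RecordingDataset are hypothetical names, and the stand-in class only counts calls instead of talking to a server.

```python
def add_entries_in_bulk(ds, new_entries):
    """Add every (name, molecule) pair without saving, then save once."""
    count = 0
    for name, molecule in new_entries:
        ds.add_entry(name, molecule, save=False)
        count += 1
    ds.save()  # single upload after all entries are in place
    return count


class _RecordingDataset:
    """Hypothetical stand-in that records calls; not part of qcportal."""

    def __init__(self):
        self.entries = {}
        self.save_calls = 0

    def add_entry(self, name, initial_molecule, save=True):
        self.entries[name] = initial_molecule
        if save:
            self.save()

    def save(self):
        self.save_calls += 1


demo = _RecordingDataset()
added = add_entries_in_bulk(demo, [("CCO-0", "mol-a"), ("CCO-1", "mol-b")])
```

Here the stand-in is saved once for two entries. With the default save=True, each add_entry call would trigger its own save.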

Computational Tasks

To run the geometry optimizations for a particular specification (the default one in this case), call the compute method of the OptimizationDataset class:

>>> ds.compute(specification="default", tag="optional_tag")
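Because compute returns the number of submitted computations, submissions for several specifications can be tallied in a loop. The following is a minimal sketch: submit_specifications and _StubDataset are hypothetical names, and the stub mimics only the compute signature documented in the reference further down.

```python
def submit_specifications(ds, specifications, tag=None, priority=None):
    """Call ds.compute once per specification; return counts keyed by name."""
    submitted = {}
    for spec_name in specifications:
        submitted[spec_name] = ds.compute(
            specification=spec_name, tag=tag, priority=priority
        )
    return submitted


class _StubDataset:
    """Hypothetical stand-in for illustration only; not part of qcportal."""

    def __init__(self, n_entries):
        self.n_entries = n_entries
        self.calls = []

    def compute(self, specification, subset=None, tag=None, priority=None):
        # Pretend every entry is submitted for the requested specification.
        self.calls.append((specification, tag, priority))
        return self.n_entries


stub_ds = _StubDataset(n_entries=3)
counts = submit_specifications(stub_ds, ["default", "tight"], tag="optional_tag")
```

The specification name "tight" is invented for the example. A real dataset may report fewer submissions than it has entries, for instance when some computations already exist.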


class qcportal.collections.OptimizationDataset(name: str, client: FractalClient = None, **kwargs)
class DataModel(*, id: str = 'local', name: str, collection: str, provenance: Dict[str, str] = {}, tags: List[str] = [], tagline: str = None, description: str = None, group: str = 'default', visibility: bool = True, view_url_hdf5: str = None, view_url_plaintext: str = None, view_metadata: Dict[str, str] = None, view_available: bool = False, metadata: Dict[str, Any] = {}, records: Dict[str, qcportal.collections.optimization_dataset.OptEntry] = {}, history: Set[str] = {}, specs: Dict[str, qcportal.collections.optimization_dataset.OptEntrySpecification] = {})
  • id (str, Default: local)

  • name (str)

  • collection (str)

  • provenance (Dict[str, str], Default: {})

  • tags (List[str], Default: [])

  • tagline (str, Optional)

  • description (str, Optional)

  • group (str, Default: default)

  • visibility (bool, Default: True)

  • view_url_hdf5 (str, Optional)

  • view_url_plaintext (str, Optional)

  • view_metadata (Dict[str, str], Optional)

  • view_available (bool, Default: False)

  • metadata (Dict[str, Any], Default: {})

  • records (Dict[str, OptEntry], Default: {})

  • history (Set[str], Default: set())

  • specs (Dict[str, OptEntrySpecification], Default: {})

class Config
compare(other: Union[qcelemental.models.basemodels.ProtoModel, pydantic.main.BaseModel], **kwargs) → bool

Compares the current object to the provided object recursively.

  • other (Model) – The model to compare to.

  • **kwargs – Additional kwargs to pass to qcelemental.compare_recursive.


True if the objects match.

Return type

bool


classmethod construct(_fields_set: Optional[SetStr] = None, **values: Any) → Model

Creates a new model setting __dict__ and __fields_set__ from trusted or pre-validated data. Default values are respected, but no other validation is performed.

copy(*, include: Union[AbstractSetIntStr, MappingIntStrAny] = None, exclude: Union[AbstractSetIntStr, MappingIntStrAny] = None, update: DictStrAny = None, deep: bool = False) → Model

Duplicate a model, optionally choose which fields to include, exclude and change.

  • include – fields to include in new model

  • exclude – fields to exclude from new model, as with values this takes precedence over include

  • update – values to change/add in the new model. Note: the data is not validated before creating the new model: you should trust this data

  • deep – set to True to make a deep copy of the model


new model instance

dict(**kwargs) → Dict[str, Any]

Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.

property fields
classmethod from_orm(obj: Any) → Model

json(**kwargs)

Generate a JSON representation of the model, include and exclude arguments as per dict().

encoder is an optional function to supply as default to json.dumps(), other arguments as per json.dumps().

classmethod parse_file(path: Union[str, pathlib.Path], *, encoding: Optional[str] = None) → qcelemental.models.basemodels.ProtoModel

Parses a file into a Model object.

  • path (Union[str, Path]) – The path to the file.

  • encoding (str, optional) – The type of the files, available types are: {‘json’, ‘msgpack’, ‘pickle’}. Attempts to automatically infer the file type from the file extension if None.


The requested model from a serialized format.

Return type

ProtoModel


classmethod parse_obj(obj: Any) → Model
classmethod parse_raw(data: Union[bytes, str], *, encoding: Optional[str] = None) → qcelemental.models.basemodels.ProtoModel

Parses raw string or bytes into a Model object.

  • data (Union[bytes, str]) – A serialized data blob to be deserialized into a Model.

  • encoding (str, optional) – The type of the serialized array, available types are: {‘json’, ‘json-ext’, ‘msgpack-ext’, ‘pickle’}


The requested model from a serialized format.

Return type

ProtoModel


classmethod schema(by_alias: bool = True, ref_template: str = '#/definitions/{model}') → DictStrAny
classmethod schema_json(*, by_alias: bool = True, ref_template: str = '#/definitions/{model}', **dumps_kwargs: Any) → str
serialize(encoding: str, *, include: Optional[Set[str]] = None, exclude: Optional[Set[str]] = None, exclude_unset: Optional[bool] = None, exclude_defaults: Optional[bool] = None, exclude_none: Optional[bool] = None) → Union[bytes, str]

Generates a serialized representation of the model

  • encoding (str) – The serialization type, available types are: {‘json’, ‘json-ext’, ‘msgpack-ext’}

  • include (Optional[Set[str]], optional) – Fields to be included in the serialization.

  • exclude (Optional[Set[str]], optional) – Fields to be excluded in the serialization.

  • exclude_unset (Optional[bool], optional) – If True, skips fields that have default values provided.

  • exclude_defaults (Optional[bool], optional) – If True, skips fields that have set or defaulted values equal to the default.

  • exclude_none (Optional[bool], optional) – If True, skips fields that have value None.


The serialized model.

Return type

Union[bytes, str]

to_string(pretty: bool = False) → str
classmethod update_forward_refs(**localns: Any) → None

Try to update ForwardRefs on fields based on this Model, globalns and localns.

classmethod validate(value: Any) → Model
add_entry(name: str, initial_molecule: Molecule, additional_keywords: Optional[Dict[str, Any]] = None, attributes: Optional[Dict[str, Any]] = None, save: bool = True) → None
  • name (str) – The name of the entry, will be used for the index

  • initial_molecule (Molecule) – The starting Molecule for the Optimization

  • additional_keywords (Dict[str, Any], optional) – Additional keywords to add to the optimization run

  • attributes (Dict[str, Any], optional) – Additional attributes and descriptions for the entry

  • save (bool, optional) – If True, saves the collection after adding the entry. If this is False, be careful to call save after all entries are added; otherwise data pointers may be lost.

add_specification(name: str, optimization_spec: qcportal.models.common_models.OptimizationSpecification, qc_spec: qcportal.models.common_models.QCSpecification, description: Optional[str] = None, protocols: Optional[Dict[str, Any]] = None, overwrite=False) → None
  • name (str) – The name of the specification

  • optimization_spec (OptimizationSpecification) – A full optimization specification for Optimization

  • qc_spec (QCSpecification) – A full quantum chemistry specification for Optimization

  • description (str, optional) – A short text description of the specification

  • protocols (Optional[Dict[str, Any]], optional) – Protocols for this specification.

  • overwrite (bool, optional) – Overwrite existing specification names

compute(specification: str, subset: Optional[Set[str]] = None, tag: Optional[str] = None, priority: Optional[str] = None) → int

Computes a specification for all entries in the dataset.

  • specification (str) – The specification name.

  • subset (Set[str], optional) – Computes only a subset of the dataset.

  • tag (Optional[str], optional) – The queue tag to use when submitting compute requests.

  • priority (Optional[str], optional) – The priority of the jobs: low, medium, or high.


The number of submitted computations

Return type

int


counts(entries: Optional[Union[List[str], str]] = None, specs: Optional[Union[List[str], str]] = None) → pandas.core.frame.DataFrame

Counts the number of optimization or gradient evaluations associated with the Optimizations.

  • entries (Union[str, List[str]]) – The entries to query for

  • specs (Optional[Union[str, List[str]]], optional) – The specifications to query for

  • count_gradients (bool, optional) – If True, counts the total number of gradient calls. Warning! This can be slow for large datasets.


The queried counts.

Return type

DataFrame


classmethod from_json(data: Dict[str, Any], client: FractalClient = None) → Collection

Creates a new class from a JSON blob

  • data (Dict[str, Any]) – The JSON blob to create a new class from.

  • client (FractalClient, optional) – A FractalClient connected to a server


A constructed collection.

Return type

Collection


classmethod from_server(client: FractalClient, name: str) → Collection

Creates a new class from a server

  • client (FractalClient) – A FractalClient connected to a server

  • name (str) – The name of the collection to pull from.


A constructed collection.

Return type

Collection


get_entry(name: str) → Any

Obtains a record from the Dataset


name (str) – The record name to pull from.


The requested record

Return type

Any


get_record(name: str, specification: str) → Any

Pulls the individual computational record for the requested entry name and specification.

  • name (str) – The index name to pull the record of.

  • specification (str) – The name of specification to pull the record of.


The requested Record

Return type

Any


get_specification(name: str) → Any

name (str) – The name of the specification


The requested specification.

Return type

Any


list_specifications(description=True) → Union[List[str], pandas.core.frame.DataFrame]

Lists all available specifications


description (bool, optional) – If True, returns a DataFrame that includes the descriptions


A list of known specification names.

Return type

Union[List[str], ‘DataFrame’]

query(specification: str, force: bool = False) → pandas.core.series.Series

Queries a given specification from the server

  • specification (str) – The specification name to query

  • force (bool, optional) – Force a fresh query if the specification already exists.


Records collected from the server

Return type

Series


save(client: Optional[FractalClient] = None) → ObjectId

Uploads the overall structure of the Collection (indices, options, new molecules, etc) to the server.


client (FractalClient, optional) – A FractalClient connected to a server to upload to


The ObjectId of the saved collection.

Return type

ObjectId


status(specs: Optional[Union[List[str], str]] = None, collapse: bool = True, status: Optional[str] = None, detail: bool = False) → pandas.core.frame.DataFrame

Returns the status of all current specifications.

  • collapse (bool, optional) – Collapse the status into summaries per specification or not.

  • status (Optional[str], optional) – If not None, only returns results that match the provided status.

  • detail (bool, optional) – Shows a detailed description of the current status of incomplete jobs.


A DataFrame of all known statuses

Return type

DataFrame


to_json(filename: Optional[str] = None)

If a filename is provided, dumps the file to disk. Otherwise returns a copy of the current data.


filename (str, Optional, Default: None) – The filename to dump the data to.


ret – A JSON representation of the Collection

Return type