Optimization Dataset

The OptimizationDataset collection represents geometry optimizations performed on a series of Molecules. An OptimizationDataset uses specifications to manage the parameters of the geometry optimizer and the underlying gradient calculation.

Existing OptimizationDataset collections can be listed with FractalClient.list_collections("OptimizationDataset") and obtained with FractalClient.get_collection("OptimizationDataset", name).


List specifications:

ds.list_specifications()


Show status of calculations for a given specification:

ds.status("default")


The number of geometry steps for each molecule can be shown:

ds.counts()


Individual OptimizationRecords can be extracted:

ds.get_record(name="CCO-0", specification="default")


Create a new collection:

import qcportal as ptl
ds = ptl.collections.OptimizationDataset(name="QM8-T", client=client)  # client: a connected FractalClient

Provide a specification:

spec = {'name': 'default',
        'description': 'Geometric + Psi4/B3LYP-D3/Def2-SVP.',
        'optimization_spec': {'program': 'geometric', 'keywords': None},
        'qc_spec': {'driver': 'gradient',
                    'method': 'b3lyp-d3',
                    'basis': 'def2-svp',
                    'keywords': None,
                    'program': 'psi4'}}
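
The keys of the specification dictionary above (name, description, optimization_spec, qc_spec) match the argument names of add_specification, so one way to register it is to unpack the dictionary as keyword arguments. The following is a minimal sketch rather than the library's prescribed workflow: register_specification is a hypothetical helper name, ds is assumed to be an OptimizationDataset, and the sketch assumes that plain dictionaries are accepted in place of OptimizationSpecification and QCSpecification objects.

```python
from typing import Any, Dict

# Arguments that add_specification requires, per its documented signature.
REQUIRED_KEYS = {"name", "optimization_spec", "qc_spec"}


def register_specification(ds: Any, spec: Dict[str, Any]) -> None:
    """Hypothetical helper: check a specification dictionary, then add it to a dataset.

    `ds` is assumed to be an OptimizationDataset (or any object exposing the
    same add_specification method).
    """
    missing = REQUIRED_KEYS - set(spec)
    if missing:
        raise KeyError(f"specification is missing required keys: {sorted(missing)}")
    # Unpack the dictionary into the keyword arguments of add_specification.
    ds.add_specification(**spec)
```

With the spec dictionary defined above, the call would be register_specification(ds, spec).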

Add molecules to optimize:

ds.add_entry(name, molecule)

If adding molecules in batches, you may wish to defer saving the dataset to the server until all molecules are added:

for name, molecule in new_entries:
    ds.add_entry(name, molecule, save=False)
ds.save()  # upload all deferred entries in one call


Submit computations for a specification:

ds.compute(specification="default", tag="optional_tag")
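
The steps above (add entries without saving, save once, then compute) can be sketched as a single helper. This is an illustration under stated assumptions, not part of the library: add_entries_and_compute is a hypothetical name, ds is assumed to be an OptimizationDataset, and new_entries is assumed to be an iterable of (name, molecule) pairs.

```python
from typing import Any, Iterable, Optional, Tuple


def add_entries_and_compute(
    ds: Any,
    new_entries: Iterable[Tuple[str, Any]],
    specification: str = "default",
    tag: Optional[str] = None,
) -> int:
    """Hypothetical helper: batch-add entries, save once, then submit computations."""
    for name, molecule in new_entries:
        # Defer the upload so the collection is not saved once per entry.
        ds.add_entry(name, molecule, save=False)
    # A single save uploads the collection with every deferred entry.
    ds.save()
    # compute returns the number of submitted computations.
    return ds.compute(specification, tag=tag)
```

Saving once after the loop follows the warning on add_entry: with save=False, data pointers may be lost unless save is called after all entries are added.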


class qcportal.collections.OptimizationDataset(name: str, client: FractalClient = None, **kwargs)
class DataModel(*, id: str = 'local', name: str, collection: str, provenance: Dict[str, str] = {}, tags: List[str] = [], tagline: str = None, description: str = None, group: str = 'default', visibility: bool = True, view_url_hdf5: str = None, view_url_plaintext: str = None, view_metadata: Dict[str, str] = None, view_available: bool = False, metadata: Dict[str, Any] = {}, records: Dict[str, qcportal.collections.optimization_dataset.OptEntry] = {}, history: Set[str] = {}, specs: Dict[str, qcportal.collections.optimization_dataset.OptEntrySpecification] = {})
  • id (str, Default: local)

  • name (str)

  • collection (str)

  • provenance (Dict[str, str], Default: {})

  • tags (List[str], Default: [])

  • tagline (str, Optional)

  • description (str, Optional)

  • group (str, Default: default)

  • visibility (bool, Default: True)

  • view_url_hdf5 (str, Optional)

  • view_url_plaintext (str, Optional)

  • view_metadata (Dict[str, str], Optional)

  • view_available (bool, Default: False)

  • metadata (Dict[str, Any], Default: {})

  • records (Dict[str, OptEntry], Default: {})

  • history (Set[str], Default: set())

  • specs (Dict[str, OptEntrySpecification], Default: {})

class Config
compare(other: Union[ProtoModel, pydantic.main.BaseModel], **kwargs) → bool

Compares the current object to the provided object recursively.

  • other (Model) – The model to compare to.

  • **kwargs – Additional kwargs to pass to qcelemental.compare_recursive.


True if the objects match.

Return type

bool


classmethod construct(_fields_set: Optional[SetStr] = None, **values: Any) → Model

Creates a new model setting __dict__ and __fields_set__ from trusted or pre-validated data. Default values are respected, but no other validation is performed.

copy(*, include: Union[AbstractSetIntStr, MappingIntStrAny] = None, exclude: Union[AbstractSetIntStr, MappingIntStrAny] = None, update: DictStrAny = None, deep: bool = False) → Model

Duplicate a model, optionally choose which fields to include, exclude and change.

  • include – fields to include in new model

  • exclude – fields to exclude from new model, as with values this takes precedence over include

  • update – values to change/add in the new model. Note: the data is not validated before creating the new model: you should trust this data

  • deep – set to True to make a deep copy of the model


new model instance

dict(**kwargs) → Dict[str, Any]

Generate a dictionary representation of the model, optionally specifying which fields to include or exclude.

property fields
classmethod from_orm(obj: Any) → Model
json(**kwargs)

Generate a JSON representation of the model, include and exclude arguments as per dict().

encoder is an optional function to supply as default to json.dumps(), other arguments as per json.dumps().

classmethod parse_file(path: Union[str, pathlib.Path], *, encoding: str = None) → qcelemental.models.basemodels.ProtoModel

Parses a file into a Model object.

  • path (Union[str, Path]) – The path to the file.

  • encoding (str, optional) – The type of the files, available types are: {‘json’, ‘msgpack’, ‘pickle’}. Attempts to automatically infer the file type from the file extension if None.


The requested model from a serialized format.

Return type

ProtoModel


classmethod parse_obj(obj: Any) → Model
classmethod parse_raw(data: Union[bytes, str], *, encoding: str = None) → qcelemental.models.basemodels.ProtoModel

Parses raw string or bytes into a Model object.

  • data (Union[bytes, str]) – A serialized data blob to be deserialized into a Model.

  • encoding (str, optional) – The type of the serialized array, available types are: {‘json’, ‘json-ext’, ‘msgpack-ext’, ‘pickle’}


The requested model from a serialized format.

Return type

ProtoModel


classmethod schema(by_alias: bool = True) → DictStrAny
classmethod schema_json(*, by_alias: bool = True, **dumps_kwargs: Any) → unicode
serialize(encoding: str, *, include: Optional[Set[str]] = None, exclude: Optional[Set[str]] = None, exclude_unset: Optional[bool] = None) → Union[bytes, str]

Generates a serialized representation of the model

  • encoding (str) – The serialization type, available types are: {‘json’, ‘json-ext’, ‘msgpack-ext’}

  • include (Optional[Set[str]], optional) – Fields to be included in the serialization.

  • exclude (Optional[Set[str]], optional) – Fields to be excluded in the serialization.

  • exclude_unset (Optional[bool], optional) – If True, skips fields that have default values provided.


The serialized model.

Return type

Union[bytes, str]

to_string(pretty: bool = False) → unicode
classmethod update_forward_refs(**localns: Any) → None

Try to update ForwardRefs on fields based on this Model, globalns and localns.

classmethod validate(value: Any) → Model
add_entry(name: str, initial_molecule: Molecule, additional_keywords: Optional[Dict[str, Any]] = None, attributes: Optional[Dict[str, Any]] = None, save: bool = True) → None
  • name (str) – The name of the entry, will be used for the index

  • initial_molecule (Molecule) – The starting Molecule for the optimization

  • additional_keywords (Dict[str, Any], optional) – Additional keywords to add to the optimization run

  • attributes (Dict[str, Any], optional) – Additional attributes and descriptions for the entry

  • save (bool, optional) – If True, saves the collection after adding the entry. If False, be sure to call save after all entries are added; otherwise data pointers may be lost.

add_specification(name: str, optimization_spec: qcportal.models.common_models.OptimizationSpecification, qc_spec: qcportal.models.common_models.QCSpecification, description: Optional[str] = None, protocols: Optional[Dict[str, Any]] = None, overwrite=False) → None
  • name (str) – The name of the specification

  • optimization_spec (OptimizationSpecification) – A full optimization specification for Optimization

  • qc_spec (QCSpecification) – A full quantum chemistry specification for Optimization

  • description (str, optional) – A short text description of the specification

  • protocols (Optional[Dict[str, Any]], optional) – Protocols for this specification.

  • overwrite (bool, optional) – Overwrite existing specification names

compute(specification: str, subset: Set[str] = None, tag: Optional[str] = None, priority: Optional[str] = None) → int

Computes a specification for all entries in the dataset.

  • specification (str) – The specification name.

  • subset (Set[str], optional) – Computes only a subset of the dataset.

  • tag (Optional[str], optional) – The queue tag to use when submitting compute requests.

  • priority (Optional[str], optional) – The priority of the jobs: low, medium, or high.


The number of submitted computations

Return type

int
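
The subset and priority parameters of compute can be combined to submit only selected entries. The sketch below is a hedged illustration: compute_subset is a hypothetical helper, ds is assumed to be an OptimizationDataset, and the allowed priority strings are taken from the parameter description above (low, medium, or high).

```python
from typing import Any, Iterable

# Priority names as listed in the documentation of compute.
ALLOWED_PRIORITIES = {"low", "medium", "high"}


def compute_subset(ds: Any, specification: str, names: Iterable[str], priority: str = "low") -> int:
    """Hypothetical helper: submit computations for selected entries only."""
    if priority not in ALLOWED_PRIORITIES:
        raise ValueError(f"priority must be one of {sorted(ALLOWED_PRIORITIES)}, got {priority!r}")
    # compute expects the subset as a set of entry names; duplicates collapse here.
    return ds.compute(specification, subset=set(names), priority=priority)
```

For example, compute_subset(ds, "default", ["CCO-0"], priority="high") would request only the CCO-0 entry at high priority.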


counts(entries: Optional[Union[List[str], str]] = None, specs: Optional[Union[List[str], str]] = None) → pandas.core.frame.DataFrame

Counts the number of optimization or gradient evaluations associated with the Optimizations.

  • entries (Union[str, List[str]]) – The entries to query for

  • specs (Optional[Union[str, List[str]]], optional) – The specifications to query for

  • count_gradients (bool, optional) – If True, counts the total number of gradient calls. Warning! This can be slow for large datasets.


The queried counts.

Return type

DataFrame


classmethod from_json(data: Dict[str, Any], client: FractalClient = None) → Collection

Creates a new class from a JSON blob

  • data (Dict[str, Any]) – The JSON blob to create a new class from.

  • client (FractalClient, optional) – A FractalClient connected to a server


A constructed collection.

Return type

Collection


classmethod from_server(client: FractalClient, name: str) → Collection

Creates a new class from a server

  • client (FractalClient) – A FractalClient connected to a server

  • name (str) – The name of the collection to pull from.


A constructed collection.

Return type

Collection


get_entry(name: str) → Any

Obtains a record from the Dataset


name (str) – The record name to pull from.


The requested record

Return type

Any


get_record(name: str, specification: str) → Any

Pulls an individual computational record of the requested name and column.

  • name (str) – The index name to pull the record of.

  • specification (str) – The name of specification to pull the record of.


The requested Record

Return type

Any


get_specification(name: str) → Any

name (str) – The name of the specification


The requested specification.

Return type

Any


list_specifications(description=True) → Union[List[str], pandas.core.frame.DataFrame]

Lists all available specifications


description (bool, optional) – If True returns a DataFrame with Description


A list of known specification names.

Return type

Union[List[str], ‘DataFrame’]

query(specification: str, force: bool = False) → pandas.core.series.Series

Queries a given specification from the server

  • specification (str) – The specification name to query

  • force (bool, optional) – Force a fresh query if the specification already exists.


Records collected from the server

Return type

Series


save(client: Optional[FractalClient] = None) → ObjectId

Uploads the overall structure of the Collection (indices, options, new molecules, etc) to the server.


client (FractalClient, optional) – A FractalClient connected to a server to upload to


The ObjectId of the saved collection.

Return type

ObjectId


status(specs: Union[str, List[str]] = None, collapse: bool = True, status: Optional[str] = None, detail: bool = False) → pandas.core.frame.DataFrame

Returns the status of all current specifications.

  • collapse (bool, optional) – Collapse the status into summaries per specification or not.

  • status (Optional[str], optional) – If not None, only returns results that match the provided status.

  • detail (bool, optional) – Shows a detailed description of the current status of incomplete jobs.


A DataFrame of all known statuses

Return type

DataFrame
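
The collapse and status parameters can be used together to find which entries are in a particular state. The sketch below rests on several assumptions that the text above does not confirm: entries_with_status is a hypothetical helper, "ERROR" is assumed to be a valid status string on the server, and the uncollapsed DataFrame is assumed to be indexed by entry name.

```python
from typing import Any, List


def entries_with_status(ds: Any, specification: str, status: str = "ERROR") -> List[str]:
    """Hypothetical helper: names of entries whose record has the given status."""
    # collapse=False asks for per-entry rows instead of a per-specification summary;
    # status keeps only the results that match the requested status.
    table = ds.status(specification, collapse=False, status=status)
    # Assumption: the returned DataFrame uses entry names as its index.
    return list(table.index)
```

A call such as entries_with_status(ds, "default") would then list the entries to inspect further with get_record.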


to_json(filename: Optional[str] = None)

If a filename is provided, dumps the file to disk. Otherwise returns a copy of the current data.


filename (str, Optional, Default: None) – The filename to dump the data to.


ret – A JSON representation of the Collection

Return type