<molecule id="http://www.mol.repo.org/molecule?abc123" ns="myNS">
<molecule id="http://www.mol.repo.org/molecule?abc234" ns="myNS">
<descriptor id="http://www.cdk.sf.net/descriptors/xlogp:implementation1" name="XlogP">
<parameter key="cutoff" value="10">
<descriptor id="http://www.cdk.sf.net/descriptors/ZagrebIndex:02"/ name="Zagreb Index">
The implementation would be an automatic build that calculates all descriptors for each molecules when the qsar.xml changes, but only for deltas (i.e. partial build, do not build already built). This means, add a new descriptor to the qsar.xml and on save it will be calculated for all molecules. This would be done on the background and produce a descriptor matrix (dataset.csv). Any plots of this matrix would also be updated.
I have started to implement this in Bioclipse2 and intend to also create a public repository for storing these QSAR setups.
So, what is missing? Having unique ID's following the REST architecture for descriptors and molecules would make it a standards candidate for QSAR data matrices setup. I shall explore this with the CDK people. Comments on the project's design and implementation are very welcome.