Perturbation-theory machine learning for mood disorders: virtual design of dual inhibitors of NET and SERT proteins

Kleandrova, Valeria V.; Cordeiro, M. Natália D. S.; Speck-Planche, Alejandro

doi:10.1186/s13065-024-01376-z

Research
Open access
Published: 02 January 2025

BMC Chemistry volume 19, Article number: 2 (2025) Cite this article

This article has been updated

Abstract

Mood disorders affect the daily lives of millions of people worldwide. The search for more efficient therapies for mood disorders remains an active field of research. In silico approaches can accelerate the search for inhibitors against protein targets related to mood disorders. Here, we developed the first perturbation-theory machine learning model based on a multilayer perceptron network (PTML-MLP) for the simultaneous prediction and design of virtual dual-target inhibitors against two proteins associated with mood disorders, namely norepinephrine and serotonin transporters (NET and SERT, respectively). The PTML-MLP model had an accuracy of around 80%. From a chemical point of view, the PTML-MLP model could accurately identify both single- and dual-target inhibitors present in the dataset used to build it. Through the application of the fragment-based topological design (FBTD) approach, the molecular descriptors (multi-label graph-based indices) present in the PTML-MLP model were physicochemically and structurally interpreted. Such interpretations enabled (a) the extraction of different molecular fragments with a positive influence on the enhancement of the dual-target activity and (b) the design of four new drug-like molecules by assembling (fusing and/or connecting) several suitable molecular fragments. The designed molecules were predicted by the PTML-MLP model to exhibit dual-target activity against the NET and SERT proteins. These predictions, together with the estimated druglikeness, suggest that the designed molecules could be promising new chemotypes to be considered for future synthesis and biological experimentation in the context of treatments for mood disorders.

Introduction

Mood disorders constitute complex and debilitating medical conditions, which are marked by disruptions in emotions. These include psychiatric illnesses such as major depressive disorder, hypomania, bipolar disorder, cyclothymia, and many others. Mood disorders are among the leading contributors to the global burden of diseases [1]. They affect around 20% of the population worldwide [2], and, according to a 2018 study reported in the USA, have an associated annual economic cost of more than US$326 billion [3]. However, over the past two decades, it has been demonstrated that, for many patients with certain mood disorders, one medication may not be enough to tackle the broad range of symptoms they experience. For instance, 30% of patients with major depressive disorder will not remit even after going through separate courses of treatment with multiple antidepressants [4]. Generally speaking, standard treatments for mood disorders (e.g., antidepressants) have proven to be inefficient because the clinical conditions of the patients are either partially improved or remain completely unchanged. The lack of efficacy of the drugs used to treat mood disorders is closely linked to their pharmacology; they act through single mechanisms of action (by inhibiting only one target protein). Consequently, such drugs fall short because of the complex multi-genetic nature of mood disorders [5]. This limitation, coupled with the serious side effects associated with the current treatments for mood disorders, has led to a reconsideration of the drug development strategies. Thus, there has been a paradigm shift characterized by the emergence of therapies based on dual-target or multi-target inhibitors, i.e., drugs/chemicals able to modulate more than one target involved in the appearance/progression of one or several types of mood disorders.

Before successfully developing an efficacious clinical treatment for mood disorders, it is essential to rationalize and speed up the search for novel and versatile chemicals with improved pharmacological profiles. Therefore, computational methods must be employed because they have become an essential part of most drug discovery campaigns [6]. In this context, many in silico approaches such as molecular docking [7, 8], network pharmacology [9], quantitative structure-activity relationships (QSAR) [10], pharmacophore modeling [11], and molecular dynamics simulation [12] have been used alone or in combination [11,12,13,14,15] for studying and/or discovering molecules that hold potential as future therapeutic solutions for mood disorders. Yet, these approaches, in addition to having the disadvantage of focusing on only one target protein, exhibit other drawbacks such as the use of structurally related series of molecules, the neglect of the impact of the different assay protocols on the activity values, and the lack of sufficiently clear physicochemical and structural interpretations. All these bottlenecks prevent the aforementioned computational methods from being used to design new molecules with desired (dual- or multi-target) pharmacological profiles against mood disorders.

During the last decade, the methodology known as perturbation theory combined with machine learning techniques (PTML) [16,17,18] has been developed to overcome the issues present in the modern in silico approaches mentioned above. By fusing chemical data with in vitro and/or in vivo biological information, PTML models have been successfully applied to different therapeutic areas such as oncology [19,20,21,22,23], infectious diseases [24,25,26,27,28], and neurosciences [29, 30]. At the same time, through the use of the fragment-based topological design (FBTD) approach, PTML models have been demonstrated to be interpretable [31, 32]. Thus, FBTD has enabled PTML models to be used as tools for the computer-aided design of novel molecules virtually exhibiting the desired biological activity [19, 27, 33].

To the best of our knowledge, there is no report of an in silico approach capable of enabling the discovery of dual- or multi-target inhibitors for the treatment of mood disorders. Such an approach would be beneficial because it would accelerate the early discovery of versatile chemicals with the potential to become clinically relevant treatments against mood disorders. Bearing in mind all the previous ideas, in this work, we have established the theoretical foundations for the rational discovery of new chemicals able to act as dual-target inhibitors against mood disorders. Particularly, we have combined a PTML model based on multilayer perceptron networks (PTML-MLP) with the FBTD approach to enable the in silico design and prediction of dual-target inhibitors against the proteins named norepinephrine and serotonin transporters (NET and SERT, respectively). NET and SERT were chosen because, from a pharmacological point of view, they are well-validated and attractive targets for the development of pharmacotherapeutic agents against mood disorders [34, 35].

Materials and methods

Dataset and molecular descriptors

All the stages involved in the creation of the PTML-MLP model have been described in detail very recently [28, 33, 36] and are also summarized in Fig. 1. Thus, we will only discuss here specific aspects which are different from previous reports. We retrieved all the chemical and biological data from the public web repository known as the ChEMBL database [37, 38]. The chemical data contained the identifiers and SMILES codes of each molecule while the biological data included the measures and values of inhibitory activity expressed as the half-maximal inhibitory concentration (IC₅₀). Also, the retrieved data contained information on the target proteins and the assay protocols. We deleted all the entries with missing values or activity units, as well as those where the SMILES codes were absent. In the case of a potential duplicate (molecule tested more than one time against the same target protein and by considering the same assay protocol), we kept only the entry containing the lowest IC₅₀ value.
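The curation rules above (drop incomplete entries, keep the lowest IC₅₀ among duplicates) can be illustrated with a minimal sketch. The record layout and field names (`molecule_id`, `smiles`, `target`, `assay`, `ic50`, `unit`) are hypothetical and are not the column names of a ChEMBL export; the sketch also assumes that all IC₅₀ values are already expressed in the same unit.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical record layout: the keys below are illustrative only.
Record = Dict[str, Optional[object]]


def curate(records: List[Record]) -> List[Record]:
    """Drop incomplete entries and keep the lowest IC50 per
    (molecule, target protein, assay protocol) combination."""
    best: Dict[Tuple[object, object, object], Record] = {}
    for rec in records:
        # Discard entries lacking a SMILES code, an activity value, or a unit.
        if not rec.get("smiles") or rec.get("ic50") is None or not rec.get("unit"):
            continue
        key = (rec["molecule_id"], rec["target"], rec["assay"])
        # Among duplicates, retain only the entry with the lowest IC50.
        if key not in best or rec["ic50"] < best[key]["ic50"]:
            best[key] = rec
    return list(best.values())


raw = [
    {"molecule_id": "M1", "smiles": "CCO", "target": "NET", "assay": "A", "ic50": 120.0, "unit": "nM"},
    {"molecule_id": "M1", "smiles": "CCO", "target": "NET", "assay": "A", "ic50": 80.0, "unit": "nM"},
    {"molecule_id": "M2", "smiles": None, "target": "SERT", "assay": "B", "ic50": 10.0, "unit": "nM"},
    {"molecule_id": "M3", "smiles": "CCN", "target": "SERT", "assay": "B", "ic50": None, "unit": "nM"},
]
curated = curate(raw)
```

In this toy input, only one record survives: the second M1 entry, because M2 lacks a SMILES code, M3 lacks an activity value, and the first M1 entry is a duplicate with a higher IC₅₀.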

We labeled each molecule as active [IAi(cj) = 1] if it exhibited IC₅₀ ≤ 150 nM against a defined protein; otherwise, the molecule was annotated as inactive [IAi(cj) = − 1]. Notice that the cut-off value mentioned above was chosen because, apart from being a fairly rigorous activity value (low to medium nanomolar range), it allowed the dataset to be kept relatively balanced in terms of the number of active and inactive compounds. We would like to highlight that IAi(cj) was a categorical variable that indicated the inhibitory activity of the ith molecule under the experimental condition cj. At the same time, cj was formed by two elements, namely the target protein (tp) against which the assay was performed (NET or SERT) and the assay protocol (ai) employed to experimentally test the molecule.
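As a small illustration of this labeling rule, the sketch below assigns IAi(cj) from an IC₅₀ value in nM and represents the experimental condition cj as a (tp, ai) pair. The function and variable names are ours and are not taken from the original workflow.

```python
from typing import Tuple

ACTIVITY_CUTOFF_NM = 150.0  # cut-off stated in the text (IC50 <= 150 nM)


def label_activity(ic50_nm: float) -> int:
    """Return 1 (active) if IC50 <= 150 nM, otherwise -1 (inactive)."""
    return 1 if ic50_nm <= ACTIVITY_CUTOFF_NM else -1


# An experimental condition cj combines the target protein (tp)
# and the assay protocol (ai); the assay name here is a placeholder.
condition: Tuple[str, str] = ("NET", "assay_protocol_1")

labels = [label_activity(v) for v in (12.0, 150.0, 151.0, 4200.0)]
```

The cut-off is inclusive, so a molecule with IC₅₀ = 150 nM is labeled as active.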

We used the SMILES codes (stored in a txt file) as inputs to calculate the topological indices (TIs) known as the bond-based spectral moments. These were weighted by physicochemical properties such as hydrophobicity contributions (Hyd), polar surface areas (Psa), atomic refractivities (Mol), Gasteiger-Marsili charges (Gas), and atomic weights (Ato). We also calculated the atom- and bond-based connectivity indices. The MODESLAB v1.5 software was employed for these calculations [39]. We also generated new descriptors (NTI) by dividing each TI value by NB (the number of bonds without considering bond multiplicity). We then divided the dataset into training and test series (approximately 75% for the training set and 25% for the test set). In doing so, we sorted the molecules in increasing order of their IC₅₀ values, assigning the first three molecules to the “training” set and the fourth molecule to the “test” set; this process was repeated for each protein separately. Next, we applied the Box-Jenkins moving average approach in two steps [40], which allowed us to merge the chemical data with the different aspects of the experimental information cj (tp and ai):

$$avg[TBI]cj=\frac{1}{n(cj)}\times\sum_{a=1}^{n(cj)}{TBI}_{a}$$

(1)

In Eq. 1, TBI refers to any TI or NTI descriptor mentioned above. As already explained in reference [40], the term n(cj) is the number of training cases (chemicals) labeled as active by considering the same element of the experimental condition cj (tp or ai). Also, avg[TBI]cj is an average value. We would like to highlight that Eq. 1 was applied to tp and ai separately [40]. In the second step, we calculated the multi-label indices D[TBI]cj for each chemical in the dataset:

$$D[TBI]cj=\left[\frac{TBI-avg[TBI]cj}{Std[TBI]}\right]\cdot\sqrt{p(cj)}$$

(2)

In Eq. 2, the terms TBI and avg[TBI]cj have been explained above. On the other hand, Std[TBI] is the standard deviation calculated from the TBI values by considering only the chemicals/cases present in the training set. Also, p(cj) was an a priori probability computed according to recent works [19, 40]. It is important to highlight that the multi-label graph-based indices D[TBI]cj are descriptors that measure how much a query molecule physicochemically and structurally deviates from a group of chemicals annotated as active and assayed under the same experimental conditions as that query molecule. As in the case of Eq. 1, we also applied Eq. 2 to the elements tp and ai separately. More details on the TI, NTI, TBI, avg[TBI]cj, p(cj), and D[TBI]cj values can be found in Supplementary Information 1.
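The 3:1 training/test assignment and the two moving-average steps (Eqs. 1 and 2) can be sketched as follows. This is an illustrative sketch rather than the software used in the study: the function names are ours, p(cj) and Std[TBI] are passed in as precomputed numbers because their calculation is described in the cited works, and the numerical inputs are made up.

```python
from math import sqrt
from typing import List, Sequence


def assign_series(ic50_values: Sequence[float]) -> List[str]:
    """Sort molecules by increasing IC50 and assign every fourth one to the
    test set (apply to the molecules of each protein separately)."""
    order = sorted(range(len(ic50_values)), key=lambda i: ic50_values[i])
    labels = [""] * len(ic50_values)
    for rank, idx in enumerate(order):
        labels[idx] = "test" if rank % 4 == 3 else "training"
    return labels


def condition_average(tbi_values: Sequence[float],
                      is_training_active: Sequence[bool]) -> float:
    """Eq. 1: mean TBI over the training cases labeled as active
    under one element of cj (either tp or ai)."""
    selected = [v for v, keep in zip(tbi_values, is_training_active) if keep]
    return sum(selected) / len(selected)


def multilabel_index(tbi: float, avg_cj: float,
                     std_train: float, p_cj: float) -> float:
    """Eq. 2: standardized deviation of TBI from avg[TBI]cj,
    scaled by the square root of the a priori probability p(cj)."""
    return (tbi - avg_cj) / std_train * sqrt(p_cj)


series = assign_series([5.0, 1.0, 3.0, 2.0, 4.0, 8.0, 7.0, 6.0])
avg_net = condition_average([2.0, 4.0, 6.0, 9.0], [True, True, True, False])
# std_train and p_cj are placeholder values chosen for readability.
d_value = multilabel_index(tbi=8.0, avg_cj=avg_net, std_train=2.0, p_cj=0.25)
```

With these toy numbers, avg[TBI]cj = 4.0 and D[TBI]cj = (8.0 − 4.0)/2.0 × √0.25 = 1.0, i.e., the query molecule lies above the average of the active training cases for that condition.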

The PTML-MLP model: generation, performance, and applicability domain

Before generating the PTML-MLP model, we employed the IMMAN software (version 1.0) [41], which allowed us to calculate the information index known as the mutual information differential Shannon’s entropy (MI-DSE) [42]; the D[TBI]cj descriptors containing the greatest information contents (and, potentially, displaying the greatest discriminatory powers) were the ones exhibiting the highest MI-DSE values. We also assessed the level of redundancy among all the D[TBI]cj descriptors by computing Pearson’s correlation coefficient (PCC). The program used to compute the PCC values was STATISTICA v13.5.0.17 [43]; only the non-redundant D[TBI]cj descriptors (those satisfying the condition −0.7 < PCC < +0.7) were kept for further analysis. The artificial neural networks package of this same program was employed to search for the best MLP network (PTML-MLP model). In doing so, due to our experience in working with PTML-MLP models built from datasets of different sizes [19, 27, 44], as well as the fact that there were T = 2742 training cases in this work (see Supplementary Information 1), we used a predefined configuration for different tunable hyperparameters. One of them was the number of input nodes, which was set to be I = 15 (i.e., the fifteen non-redundant D[TBI]cj descriptors exhibiting the highest MI-DSE values). The number of output nodes was automatically set by STATISTICA v13.5.0.17 to be O = 2 because only two classes were predicted (active and inactive). Other tunable hyperparameters whose values were set arbitrarily according to our experience were the number of epochs (500), the minimum and maximum numbers of hidden neurons (with values of 15 and 70, respectively), the number of networks to train (3000), and the number of networks to be retained (250) [19, 27, 44]. Here, logistic, hyperbolic tangent, and exponential were chosen as the activation functions in both the hidden and the output layers.
The best MLP network (PTML-MLP model) was the one exhibiting the highest values (in both training and test sets) of local and global sensitivities and specificities, as well as the normalized Matthews correlation coefficient (nMCC) [45]. Besides paying careful attention to all the aforementioned tunable hyperparameters, we wanted to prevent our PTML-MLP model from overfitting the data, and therefore, when searching for the best MLP network, we applied the following equation characterizing the network topology:

$$\rho=\frac{T}{\left[(I+1)H+(H+1)O\right]}$$

(3)

In Eq. 3, T, I, and O have been defined above, while H is the number of hidden neurons. If ρ > 3, it can be assumed that the MLP network is not overfitting the data [46]. Last, the applicability domain (AD) of the PTML-MLP model was assessed according to a modification of the descriptor-space approach reported by Speck-Planche and co-workers in recent works [19, 27, 44].
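The descriptor-selection step described at the start of this subsection (ranking by MI-DSE, then discarding redundant descriptors with |PCC| ≥ 0.7) can be sketched as below. The greedy order in which descriptors are compared is an assumption on our part, since the text does not state how ties between correlated descriptors were resolved; the MI-DSE scores are treated as precomputed inputs, the descriptor names and values are made up, and constant columns are assumed to have been removed beforehand.

```python
from math import sqrt
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's correlation coefficient of two equally long columns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def select_descriptors(columns: Dict[str, Sequence[float]],
                       scores: Dict[str, float],
                       max_keep: int = 15,
                       limit: float = 0.7) -> List[str]:
    """Greedy filter (assumed): visit descriptors from highest to lowest
    score and keep one only if |PCC| < limit against every kept descriptor."""
    ranked = sorted(columns, key=lambda name: scores[name], reverse=True)
    kept: List[str] = []
    for name in ranked:
        if all(abs(pearson(columns[name], columns[k])) < limit for k in kept):
            kept.append(name)
        if len(kept) == max_keep:
            break
    return kept


columns = {
    "d1": [1.0, 2.0, 3.0, 4.0],
    "d2": [2.0, 4.0, 6.0, 8.1],    # nearly collinear with d1
    "d3": [1.0, -1.0, 1.0, -1.0],  # weakly correlated with d1
}
scores = {"d1": 0.9, "d2": 0.8, "d3": 0.5}  # placeholder MI-DSE values
selected = select_descriptors(columns, scores)
```

Here `d2` is rejected because it is almost perfectly correlated with the higher-ranked `d1`, whereas `d3` is retained.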

Results and discussion

The PTML-MLP model

The most appropriate PTML-MLP model that we found has the notation MLP 15-26-2. This means that our PTML-MLP model has 15 nodes in the input layer, which is equivalent to the fifteen D[TBI]cj descriptors present in this model (Table 1). The hidden layer of the PTML-MLP model has 26 neurons; this layer uses hyperbolic tangent as the activation function. The notation also indicates that two categorical values (IAi(cj) = 1 indicating active and IAi(cj) = − 1 for inactive) were predicted by considering one node in the output layer; this layer used the activation function named identity.

Table 1 All the D[TBI]cj descriptors that entered in the PTML-MLP model

If now we substitute the values of the parameters of the PTML-MLP model associated with the network topology (i.e., T = 2742, I = 15, H = 26, and O = 2 for training cases, input nodes, hidden neurons, and output nodes, respectively) in Eq. 3, we will obtain a value of ρ = 5.83. Since ρ > 3, it can be concluded that the PTML-MLP model is not overfitting the data [46, 47].
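The arithmetic behind this value of ρ can be checked with a few lines of code. The helper `max_hidden_neurons` is our own addition: it simply scans for the largest H that still satisfies ρ > 3 for the same T, I, and O, and is not part of the reported procedure.

```python
def rho(t: int, i: int, h: int, o: int) -> float:
    """Eq. 3: training cases divided by the number of network weights."""
    return t / ((i + 1) * h + (h + 1) * o)


def max_hidden_neurons(t: int, i: int, o: int, threshold: float = 3.0) -> int:
    """Largest H for which rho stays above the threshold (illustrative)."""
    h = 1
    while rho(t, i, h + 1, o) > threshold:
        h += 1
    return h


# Values reported in the text: T = 2742, I = 15, H = 26, O = 2.
value = rho(2742, 15, 26, 2)          # 2742 / (16*26 + 27*2) = 2742 / 470
h_limit = max_hidden_neurons(2742, 15, 2)
```

The denominator is 16 × 26 + 27 × 2 = 470, so ρ = 2742/470 ≈ 5.83. Under the same criterion, networks with more than 50 hidden neurons would fall below ρ = 3 for this training set size.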

When considering the statistical performance, our PTML-MLP model exhibited a fairly good accuracy of 86.32% in the training set. Also, the same statistical index had a value of 78.57% in the test set. Other statistical indices such as the number of active and inactive cases (N_Active and N_Inactive, respectively), the number of cases correctly classified as active (CC_Active) and inactive (CC_Inactive), as well as the sensitivity (Sn), specificity (Sp), and normalized Matthews correlation coefficient (nMCC) values are reported in Table 2.

Table 2 Global statistical quality and predictive power of the PTML-MLP model as well as other PTML models based on alternative supervised learning techniques

Notice that Table 2 offers a comparative analysis of the PTML-MLP model against other PTML models based on three different supervised learning techniques, namely linear discriminant analysis (PTML-LDA), support vector machine (PTML-SVM), and random forest (PTML-RF). As described in Table 2, all the PTML models were obtained using the same dataset (as indicated by N_Active and N_Inactive). The PTML-LDA model had the worst performance. The PTML-SVM model, despite exhibiting a relatively acceptable statistical quality (training set), has a greatly reduced predictive power (test set) since its Sp < 70%. The PTML-RF model is the best among the three alternative models in terms of Sn, Sp, and nMCC values. However, the PTML-MLP model outperforms PTML-LDA, PTML-SVM, and PTML-RF, displaying the highest Sn and Sp values in both training and test sets. In the PTML-MLP model, Sn and Sp have values higher than 75%. Also, we would like to highlight that the nMCC can range from 0 (poor prediction) to 1 (ideal performance) while a value of 0.5 is associated with a random predictor. In the case of our PTML-MLP model, the nMCC values, besides being the closest to 1, are also the highest. This indicates that the PTML-MLP model has the strongest convergence between the observed [IAi(cj)] and the predicted [PredIAi(cj)] values of inhibitory activity. Altogether, the PTML-MLP model is better than the other three aforementioned alternative PTML models in terms of both statistical quality and predictive power (the classification results obtained by the PTML-MLP model for each case/chemical in our dataset can be found in Supplementary Material 2).
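For readers who wish to reproduce these indices from a list of observed and predicted labels, a minimal sketch follows. It assumes the usual rescaling nMCC = (MCC + 1)/2, which matches the range described above (0 to 1, with 0.5 for a random predictor); the toy labels are invented and do not come from the dataset.

```python
from math import sqrt
from typing import Dict, Sequence


def classification_metrics(observed: Sequence[int],
                           predicted: Sequence[int]) -> Dict[str, float]:
    """Sn, Sp, accuracy (all in %) and nMCC for labels coded as 1 / -1."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)
    tn = sum(1 for o, p in pairs if o == -1 and p == -1)
    fp = sum(1 for o, p in pairs if o == -1 and p == 1)
    fn = sum(1 for o, p in pairs if o == 1 and p == -1)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "Sn": 100.0 * tp / (tp + fn),
        "Sp": 100.0 * tn / (tn + fp),
        "Acc": 100.0 * (tp + tn) / len(pairs),
        "nMCC": (mcc + 1.0) / 2.0,  # assumed normalization of MCC to [0, 1]
    }


observed = [1, 1, 1, 1, -1, -1, -1, -1]
predicted = [1, 1, 1, -1, -1, -1, -1, 1]
metrics = classification_metrics(observed, predicted)
```

In this toy case, three of four actives and three of four inactives are recovered, giving Sn = Sp = 75%, MCC = 0.5, and nMCC = 0.75.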

One of the benefits of our PTML-MLP model is that it can predict the inhibitory activity of chemicals against two proteins (NET and SERT) by considering more than one experimental condition cj. In this sense, Table 3 clearly shows that the PTML-MLP model considers a total of six different cj (which, as explained above, are combinations of the elements tp and ai), with four of them involving NET and the remaining two focused on SERT. This means that the PTML-MLP model can predict activity against any of these proteins in a consensus manner. For instance, a query molecule can be predicted four times against NET. If a molecule is predicted as active in at least 3 of the 4 conditions cj, then, that molecule can be considered as active. A similar line of thinking can be applied to SERT; if a query molecule is predicted as active in at least one of the two experimental conditions cj, then this molecule will be regarded as active.
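The consensus rule just described can be written compactly. In this sketch the thresholds (at least 3 of 4 conditions for NET, at least 1 of 2 for SERT) come from the text, whereas the function names and the combination of both verdicts into a single dual-target flag are our own illustrative choices.

```python
from typing import Dict, Sequence

# Minimum number of "active" predictions required per target (from the text).
MIN_ACTIVE_CALLS = {"NET": 3, "SERT": 1}


def consensus(predictions: Dict[str, Sequence[int]]) -> Dict[str, bool]:
    """Map each target to True if enough of its per-condition
    predictions (coded 1 / -1) are active."""
    return {
        target: sum(1 for call in calls if call == 1) >= MIN_ACTIVE_CALLS[target]
        for target, calls in predictions.items()
    }


def is_dual_target(predictions: Dict[str, Sequence[int]]) -> bool:
    """A molecule counts as dual-target only if both consensus verdicts are active."""
    verdict = consensus(predictions)
    return verdict.get("NET", False) and verdict.get("SERT", False)


molecule_a = {"NET": [1, 1, 1, -1], "SERT": [-1, 1]}
molecule_b = {"NET": [1, 1, -1, -1], "SERT": [1, 1]}
dual_a = is_dual_target(molecule_a)
dual_b = is_dual_target(molecule_b)
```

Molecule A satisfies both thresholds, while molecule B is active against SERT only, because just two of its four NET predictions are active.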

Table 3 Summary of the diverse experimental conditions reported in the dataset used to create the PTML-MLP model

To support the capability of our PTML-MLP model to perform the aforementioned consensus predictions, we calculated the values of the statistical metrics symbolized as [Sn(%)]tp, [Sp(%)]tp, [Sn(%)]ai, and [Sp(%)]ai. These are the local counterparts of Sn and Sp but depend either on the target protein tp or the assay protocol ai. Regardless of the training or test sets, the values for all these local metrics were higher than 70% (see Supplementary Material 2), which means that the PTML-MLP model can correctly classify/predict at least 70% of the chemicals (either active or inactive) across different targets and assays. The only exceptions were the [Sn(%)]ai values for assay “B (cell membrane format)”, which were 62.50% and 50.00% in training and test sets, respectively.

Although our PTML-MLP model can perform consensus predictions, we also assessed the AD, which, as mentioned in the Materials and methods section, was carried out according to a variation of the descriptors’ space approach [27, 48]. Thus, we generated 15 local scores of applicability domain (one for each D[TBI]cj descriptor) by comparing each D[TBI]cj value of any query molecule with the corresponding maximum and minimum D[TBI]cj values. If the D[TBI]cj value of a molecule was inside the boundary formed by the maximum and minimum D[TBI]cj values, the local score took the value of one; otherwise, the local score took the value of zero. For each molecule present in the dataset used to build the PTML-MLP model, this procedure was applied to each of the D[TBI]cj descriptors. Then, we calculated the total score of the applicability domain (TSAD) as the summation of local scores. Because 15 D[TBI]cj descriptors are present in the PTML-MLP model, a molecule must have TSAD = 15 to belong to the AD of the PTML-MLP model. In our dataset, 3633 out of 3652 molecules/cases were within the AD of the PTML-MLP model (Supplementary Material 2); the deletion of the chemicals/cases outside the AD did not significantly affect the statistical performance of the PTML-MLP model.
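The TSAD calculation can be sketched as follows. For brevity the example uses three descriptors instead of fifteen; the bounds are placeholder numbers, and treating the minimum and maximum as inclusive limits is an assumption.

```python
from typing import List, Sequence, Tuple


def tsad(values: Sequence[float],
         bounds: Sequence[Tuple[float, float]]) -> int:
    """Sum of local scores: 1 when a descriptor value lies within its
    [minimum, maximum] interval (assumed inclusive), 0 otherwise."""
    return sum(1 if low <= v <= high else 0
               for v, (low, high) in zip(values, bounds))


def in_domain(values: Sequence[float],
              bounds: Sequence[Tuple[float, float]]) -> bool:
    """A molecule belongs to the AD only if every local score equals 1."""
    return tsad(values, bounds) == len(bounds)


# Placeholder (minimum, maximum) pairs for three hypothetical descriptors.
bounds: List[Tuple[float, float]] = [(-2.0, 2.0), (-1.5, 3.0), (0.0, 1.0)]
inside = [0.3, 2.9, 1.0]
outside = [0.3, 3.4, 0.5]
score_inside = tsad(inside, bounds)
score_outside = tsad(outside, bounds)
```

With the full model, `bounds` would hold 15 intervals, and `in_domain` would correspond to the condition TSAD = 15.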

Another advantage of the present PTML-MLP model is the ability to detect privileged molecular patterns. In doing so, our PTML-MLP model accurately predicted the inhibitory activity of well-established FDA-approved drugs for the treatment of mood disorders (Fig. 2) through selective inhibition of the proteins NET or SERT.

At the same time, our PTML-MLP model was able to correctly predict the drug named duloxetine (ChEMBL1175). This chemical is a dual-target inhibitor of NET and SERT (Fig. 3) used to treat major depressive disorder and other medical conditions such as neuropathic pain, generalized anxiety disorder, and others.

The PTML-MLP also predicts other dual-target inhibitors such as chlorpromazine (ChEMBL71); however, based on the experimental IC₅₀ values reported in the dataset used to build the PTML-MLP model, in terms of dual-target inhibition, chlorpromazine is considerably weaker than duloxetine. It can be seen that there are structural differences between duloxetine and the other selective inhibitors of NET and SERT mentioned above. In any case, the local metrics analyzed before, together with the illustrative examples discussed in Figs. 2 and 3, indicate the quality and capability of our PTML-MLP model in identifying both single- and dual-target inhibitors. The classification/prediction results of the drugs depicted in Figs. 2 and 3 can be found by searching the ChEMBL identifiers associated with each drug in Supplementary Material 2.

Physicochemical and structural interpretation of the PTML-MLP model

The application of the FBTD approach comprises two steps, (a) the physicochemical and structural interpretations of the D[TBI]cj descriptors present in the PTML-MLP model and (b) the design of new molecules using these interpretations as guidelines [27, 31, 49]. The purpose of this subsection was to apply the first of these steps to gather insights regarding the physicochemical properties and structural features that can be important for the appearance and/or enhancement of the dual-target activity of any chemical against NET and SERT. In this work, while interpreting the D[TBI]cj descriptors present in our PTML-MLP model, we have relied on their estimated sensitivity values (SVs), which are illustrated in Fig. 4.

We would like to emphasize that SVs quantify the relative importance of the input variables in a neural network model [50]. When applied to our PTML-MLP model, SVs rate the influence of the D[TBI]cj descriptors (input variables). On one side, the highest SVs are associated with those D[TBI]cj descriptors having the greatest relative influences (highest discriminatory powers) in the PTML-MLP model. On the other hand, from a more phenomenological point of view, the D[TBI]cj descriptors with the highest SVs are the ones whose physicochemical and structural information should be present in most of the molecules of the dataset used to build the PTML-MLP model. Such information is also essential for the future design of any new molecule with potential dual-target activity against NET and SERT proteins.

In addition to Fig. 4, we have applied a recently reported approach, which allowed us to calculate two class-based means for each D[TBI]cj descriptor (Table 4) from the cases/chemicals present in the training set [19, 27, 31, 48]. One of these means was determined by considering the chemicals correctly classified as active while the other was computed from the chemicals correctly identified as inactive. By comparing the two class-based means for each D[TBI]cj descriptor, we qualitatively estimated whether the value of a defined D[TBI]cj descriptor could be increased or decreased to enhance the activity against both NET and SERT.
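The comparison of class-based means can be illustrated with a short sketch. The function name, the returned strings, and the numerical values are ours; only the idea of contrasting the mean over correctly classified actives with the mean over correctly classified inactives comes from the text.

```python
from typing import Sequence


def propensity(values: Sequence[float],
               observed: Sequence[int],
               predicted: Sequence[int]) -> str:
    """Suggest whether a descriptor should increase or decrease, based on
    the means of correctly classified active and inactive training cases."""
    active = [v for v, o, p in zip(values, observed, predicted) if o == 1 and p == 1]
    inactive = [v for v, o, p in zip(values, observed, predicted) if o == -1 and p == -1]
    mean_active = sum(active) / len(active)
    mean_inactive = sum(inactive) / len(inactive)
    if mean_active > mean_inactive:
        return "increase"
    if mean_active < mean_inactive:
        return "decrease"
    return "no clear tendency"


# Made-up descriptor values; the misclassified fifth case is ignored.
values = [0.8, 1.2, -0.4, -0.6, 5.0]
observed = [1, 1, -1, -1, 1]
predicted = [1, 1, -1, -1, -1]
tendency = propensity(values, observed, predicted)
```

Here the mean over correctly classified actives (1.0) exceeds that of the correctly classified inactives (−0.5), so the descriptor value would be expected to increase to favor activity.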

Table 4 Propensity of variation of the D[TBI]cj descriptors

Also, when interpreting the D[TBI]cj descriptors of our PTML-MLP model, we have associated each of them with certain subgraphs/generic fragments. In doing so, we have also given examples of how these subgraphs can be present in the form of molecular fragments such as specific atoms, functional groups, rings, and/or substructural moieties whose presence positively contributes to the versatile activity [19, 27, 48], in this case, the dual-target activity against the proteins NET and SERT (Fig. 5).

It is essential to emphasize that, when interpreting the D[TBI]cj descriptors, their values must not be expected to increase or decrease indefinitely. This comes from the fact that the values of the D[TBI]cj descriptors have their limits (given by the AD of the PTML-MLP model discussed above). At the same time, the physicochemical and structural information in a defined D[TBI]cj descriptor is constrained by one or more D[TBI]cj descriptors; consequently, the number of molecular fragments (derived from those subgraphs) is not expected to vary (increase or decrease) indefinitely either.

In our PTML-MLP model, we have six D[TBI]cj descriptors derived from the bond-based spectral moments [51,52,53,54,55,56], whose increments in their values characterize the degree of concentration of different physicochemical properties in regions of diverse size in a molecule. Consequently, it is possible to know specific regions of the molecules that can act through either polar interactions (e.g., hydrogen or halogen bonding) or London dispersion forces. These descriptors are DTBI01, DTBI02, DTBI07, DTBI08, DTBI09, and DTBI10, and they rank thirteenth, fifth, seventh, eleventh, fourteenth, and fourth among the most influential D[TBI]cj descriptors in the PTML-MLP model, respectively. In the case of DTBI01, this indicates the relative decrease of the global hydrophobicity of a molecule as the sum of the hydrophobic contributions of all the bonds in a molecule (MG-01 subgraphs). Also, according to DTBI02, the molar refractivity in the MG-02 subgraphs/fragments is expected to decrease, which means that the number of atoms with high polarizability (aromatic carbons, S, P, and halogens except fluorine) should be reduced. Consequently, aliphatic portions (including rings) are preferred over aromatic moieties. If aromatic carbons are present, they should be part of heteroaromatic rings such as pyridine, pyrazine, pyrimidine, imidazole, and/or oxazole; the last two rings are associated with subgraph MG-08 (which contains MG-02). When analyzing DTBI07, it can be seen that the hydrophobicity in the MG-03 and MG-04 subgraphs should be decreased. Therefore, the presence of functional groups such as amides, carboxyl, and sulfone groups, as well as three-membered heterocycles (e.g., oxirane, aziridine, and aziridin-2-one) will favorably decrease the value of DTBI07.

According to the information present in DTBI08, the global polar surface area (contributions of the MG-01 subgraphs) should decrease while increasing the total number of atoms. This means that the number of functional groups acting as hydrogen bond donors should be reduced as much as possible. The same is valid for the number of sulfur atoms. In the case of DTBI09, we can say that this D[TBI]cj descriptor constrains DTBI02 since the former expresses that the number of atoms with high polarizability (MG-01 subgraphs) should be increased while increasing the total number of atoms in a molecule. By jointly interpreting DTBI09 and DTBI02, we can deduce that the presence of at least one aromatic ring and several aliphatic portions is very important for the enhancement of the dual-target activity. Also, electronegative atoms such as N, O, and halogens (mainly F and Cl) should be distributed throughout the entire chemical structure of a molecule. The analysis of DTBI10 suggests an increment in the number of MG-03 and MG-04 subgraphs, which have aliphatic carbons (i.e., isopropyl group attached to any atom) or substructures containing electronegative atoms that are not bonded with hydrogens (e.g., N,N-dimethylamino, N,N-dialkyl amide, difluoromethyl, trifluoromethyl, cyclopropane, and oxirane).

In our PTML-MLP model, we also have DTBI03, DTBI06, DTBI13, DTBI14, and DTBI15; they are the tenth, first, ninth, eighth, and third most significant D[TBI]cj descriptors, respectively. These five graph-based variables are derived from the bond-based connectivity index. Therefore, they are measures of the molecular volume [57,58,59]. In the case of DTBI03, this expresses the diminution of the number of six-membered rings (subgraphs of the type MG-12). Our inspection of the dataset used to build the present PTML-MLP model indicates that the number of six-membered rings should not exceed three; these rings should preferably be polysubstituted. The volume should be diminished by decreasing the number of moieties that contain MG-05 subgraphs. This aspect is accounted for by DTBI06. Also, reducing the linearity of a molecule by adding a polysubstituted ring or more ramifications in the central part of a molecule will favorably decrease the value of DTBI06. A similar effect on the diminution of the volume is observed when analyzing the descriptors DTBI13 and DTBI14; while the former measures the average molecular volume (subgraph MG-01), the latter considers the same property in MG-02 subgraphs. In the case of DTBI15, this indicates the augmentation of the molecular volume by increasing the number of MG-07, MG-09, and MG-10 subgraphs. Examples of specific functional groups and substructural moieties containing these fragments are trifluoromethyl, N,N-dialkylamino, and any other group (or atom) attached to any atom within a ring. We would like to highlight that MG-07, MG-09, and MG-10 can also refer to aliphatic portions.

The remaining four D[TBI]cj descriptors are derived from the atom-based connectivity index, and thus, they constitute measures of molecular accessibility [60, 61], i.e., the ability of different regions in a molecule to be available to interact with the surrounding medium. The D[TBI]cj descriptors with such information content are DTBI04, DTBI05, DTBI11, and DTBI12. These rank sixth, twelfth, second, and fifteenth among the most relevant descriptors in the PTML-MLP model, respectively. The information contained in DTBI04 implies an increase in the molecular accessibility in MG-02 subgraphs, where the presence of more than one methyl group and/or halogen in the periphery of the molecule is a highly beneficial factor. In the case of DTBI05, the favorable diminution of the value of this descriptor indicates the augmentation of the number of atoms able to form hydrogen bonds in MG-11 subgraphs. This means that within each MG-11 subgraph, at least two electronegative atoms (N and O) should be present. The information contained in DTBI05 is in some way constrained by that present in DTBI11, i.e., in the latter, either the number of MG-11 subgraphs should be increased or the number of methyl groups and halogen atoms should be augmented. Last, DTBI12 characterizes the increase of MG-06 subgraphs, and examples of substructural moieties containing MG-06 are the fused ring systems and N,N-dialkyl amides, as well as the dimethylamino, difluoromethyl, and trifluoromethyl groups attached to any ring.

Virtual design of dual-target inhibitors of NET and SERT

In this section, we applied the second step of the FBTD approach, which uses the joint physicochemical and structural interpretations of the D[TBI]cj descriptors as guidelines to design new molecules. The joint interpretation of the D[TBI]cj descriptors in the PTML-MLP model allowed us to define the characteristics that a molecule should possess to exhibit dual-target activity. These characteristics are summarized as follows. Aliphatic portions and rings can appear in any part of a molecule. Any designed molecule should have a maximum of three (di- or trisubstituted) six-membered rings, all preferably located in the periphery of the molecule. Also, two of these three six-membered rings should be aliphatic. The central part of the molecule should contain either small ramifications (each having a maximum of two bonds) or a polysubstituted five-membered heteroaromatic ring. The presence of at least two methyl groups (each attached to an electronegative atom such as nitrogen or oxygen) and/or at least one halogen (particularly Cl attached to an aromatic ring) in the periphery of the molecule can be a favorable factor. When possible, OHs and NHs should be avoided; if present, the sum of OHs and NHs should not exceed two. Electronegative atoms such as nitrogen and oxygen should be separated by two or more bonds (without counting bond multiplicity). A simple inspection of the dataset used to develop our PTML-MLP model suggests that the molecular weight of the designed molecules should not surpass 450 Da.
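The three purely numerical guidelines above (ring count, OH/NH count, and molecular weight) can be sketched as a simple filter. This is a hedged illustration only: the `CandidateSummary` container, the function name, and the example values are hypothetical, the structural counts are assumed to be supplied by hand or by a cheminformatics toolkit, and the positional guidelines (periphery, central part, atom separation) are not covered.

```python
from dataclasses import dataclass


@dataclass
class CandidateSummary:
    """Hand-counted structural features of a candidate (hypothetical container)."""
    six_membered_rings: int
    oh_nh_count: int          # total number of O-H and N-H groups
    molecular_weight: float   # in Da


def guideline_violations(c: CandidateSummary) -> list[str]:
    """Return the numerical guidelines from the text that the candidate breaks."""
    problems = []
    if c.six_membered_rings > 3:
        problems.append("more than three six-membered rings")
    if c.oh_nh_count > 2:
        problems.append("more than two OH/NH groups")
    if c.molecular_weight > 450.0:
        problems.append("molecular weight above 450 Da")
    return problems


# Illustrative values only; they do not describe any molecule from this work.
ok = CandidateSummary(six_membered_rings=2, oh_nh_count=0, molecular_weight=398.3)
heavy = CandidateSummary(six_membered_rings=4, oh_nh_count=1, molecular_weight=512.0)

print(guideline_violations(ok))
print(guideline_violations(heavy))
```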

We connected and/or fused different subgraph-based molecular fragments (e.g., functional groups, rings, and moieties derived from the subgraphs mentioned above). As a result, four molecules were designed (Fig. 6). It is important to highlight that the subgraph-based molecular fragments used to design the four molecules complied with the condition that their presence was beneficial for the favorable variation (increase or decrease) of the value of more than one D[TBI]cj descriptor in the PTML-MLP model.

A summary of the predictions performed by the PTML-MLP model regarding the activity profile of the designed molecules appears in Table 5, while all the details can be found in Supplementary Material 3.

Table 5 The designed molecules and their predicted dual-target activities

The numbers in Table 5 are the predicted probabilities for each designed molecule to be identified as active under each of the six experimental conditions reported here (see Table 3). We considered a molecule to have dual-target inhibitory activity if its predicted probability was higher than 50% in at least 2 of the 4 experimental conditions cj for NET and in at least 1 of the 2 experimental conditions cj for SERT. Based on this criterion, all the designed molecules behave as dual-target inhibitors. AMD-01 was predicted as active in the lowest number of experimental conditions cj (3 of the 6). AMD-01 is very similar to AMD-02: the former has a trifluoromethyl group in position 5 of the pyrazine ring, whereas the latter contains an N,N-dimethylamino group in the same position. This suggests that the N,N-dimethylamino group is more appropriate, since it favorably decreases the local and global hydrophobicity (explained by the descriptors DTBI01 and DTBI07, respectively) while increasing the ability of the molecule to interact through London dispersion forces (characterized by DTBI12). In general, from AMD-02 to AMD-04, chemical modifications such as the replacement of certain heteroatoms (e.g., replacement of the pyrrolic nitrogen by oxygen or sulfur) or heterocycles in positions 2, 4, and 5 of the central five-membered rings can either maintain or improve the dual-target activity. As shown in Table 5, the most suitable of the designed molecules is AMD-03 because it was predicted as active in 5 of the 6 experimental conditions cj. A major contributor to the dual-target activity of AMD-03 is the 4,6-dichloropyrimidine fragment. Compared with the other fragments in the same position, and based on the values of all the D[TBI]cj descriptors among the designed molecules, the 4,6-dichloropyrimidine fragment gives AMD-03 the best balance of hydrophobicity, polarizability, and molecular volume. Consequently, this fragment should be able to interact more effectively through London dispersion forces (aromatic carbons and chlorine) and polar interactions (e.g., hydrogen bonds through the pyridinic nitrogen atoms).
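The dual-target criterion stated above can be written as a short decision rule. The sketch below is an illustration, not the authors' code: the function names are hypothetical, probabilities are assumed to be expressed on a 0 to 1 scale, and the example values are made up rather than taken from Table 5.

```python
def count_active(probabilities, threshold=0.5):
    """Number of experimental conditions with predicted probability above threshold."""
    return sum(p > threshold for p in probabilities)


def is_dual_target(net_probs, sert_probs, threshold=0.5):
    """Stated rule: active in >= 2 of 4 NET conditions and >= 1 of 2 SERT conditions."""
    if len(net_probs) != 4 or len(sert_probs) != 2:
        raise ValueError("expected 4 NET and 2 SERT probabilities")
    return (count_active(net_probs, threshold) >= 2
            and count_active(sert_probs, threshold) >= 1)


# Made-up probabilities, not the values reported in Table 5.
example_net = [0.71, 0.44, 0.63, 0.38]
example_sert = [0.58, 0.47]

print(is_dual_target(example_net, example_sert))
print(count_active(example_net + example_sert), "of 6 conditions predicted as active")
```

Under this rule, a molecule that is active in only three of the six conditions can still qualify, provided that those three are distributed as two NET conditions and one SERT condition.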

The previous subsections have shown that our PTML-MLP model, when combined with the FBTD approach, can be used to design molecules potentially exhibiting dual-target inhibition against NET and SERT. We also wanted to assess the structural novelty of the designed molecules, i.e., to gain insights into whether they could represent new molecular entities. To do so, we performed a search in widely used databases such as ChEMBL [37, 38, 62], ZINC [63], eMolecules [64], and SureChEMBL [65]. When searching these databases for molecules similar to the ones designed by us, we used a similarity cutoff of 80%; under this condition, we found no molecules whose chemical structures resembled the ones designed in this work. This indicates that the joint use of our PTML model and the FBTD approach enables the generation of molecules that, besides virtually exhibiting dual-target activity against NET and SERT, also constitute new chemotypes for future synthesis and biological evaluation in the context of mood disorders.
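As a rough sketch of what a similarity cutoff of 80% means in practice, the example below applies a Tanimoto coefficient to toy fingerprints. Both the choice of the Tanimoto coefficient and the representation of fingerprints as sets of "on" bits are assumptions made for illustration; the text does not state which similarity measure or fingerprint the database searches relied on, and the function and entry names are hypothetical.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto coefficient between two fingerprints given as sets of 'on' bits."""
    if not fp_a and not fp_b:
        return 0.0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def similar_entries(query_fp, database, cutoff=0.80):
    """Names of database entries whose similarity to the query reaches the cutoff."""
    return [name for name, fp in database.items() if tanimoto(query_fp, fp) >= cutoff]


# Toy fingerprints (hypothetical bit positions), not real database records.
query = {1, 4, 7, 9, 15, 22}
toy_db = {
    "entry_A": {1, 4, 7, 9, 15, 23},   # 5 shared bits out of 7 distinct bits
    "entry_B": {2, 3, 5},              # no shared bits
}

print(similar_entries(query, toy_db))  # an empty list means no hit at the cutoff
```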

Druglikeness and ADMET properties of the designed molecules

In addition to the novelty and the promising dual-target potency predicted for the designed molecules by the PTML-MLP model, we wanted to assess their druglikeness and properties related to their absorption, distribution, metabolism, elimination, and toxicity (ADMET) profiles. In this sense, we calculated several global physicochemical properties (Table 6).

Table 6 Physicochemical properties calculated for the designed molecules

We employed the software alvaDesc v1.0.22 [66] to perform the aforementioned calculations. The idea here was to compare the estimated values of the physicochemical properties of the designed molecules with the corresponding cutoff values established by Lipinski’s rule of five [67], Ghose’s filter [68], and Veber’s rules [69]. The results in Table 6 show that the four designed molecules comply with the three druglikeness-based guidelines, which suggests that they are likely to have acceptable oral bioavailability.
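The comparison against the three guidelines can be sketched as follows, using the commonly quoted cutoffs of each rule. This is a minimal illustration under stated assumptions: the `Properties` container and function names are hypothetical, the property values are invented rather than taken from Table 6, the Lipinski check is a strict reading in which all four conditions must hold, and the properties are assumed to have been computed beforehand with a descriptor package.

```python
from dataclasses import dataclass


@dataclass
class Properties:
    """Global physicochemical properties of one molecule (hypothetical container)."""
    mw: float        # molecular weight, Da
    logp: float      # octanol-water partition coefficient
    hbd: int         # hydrogen-bond donors
    hba: int         # hydrogen-bond acceptors
    mr: float        # molar refractivity
    n_atoms: int     # total atom count
    rot_bonds: int   # rotatable bonds
    tpsa: float      # topological polar surface area, square angstroms


def passes_lipinski(p: Properties) -> bool:
    # Strict reading: every one of the four conditions must be satisfied.
    return p.mw <= 500 and p.logp <= 5 and p.hbd <= 5 and p.hba <= 10


def passes_ghose(p: Properties) -> bool:
    return (160 <= p.mw <= 480 and -0.4 <= p.logp <= 5.6
            and 40 <= p.mr <= 130 and 20 <= p.n_atoms <= 70)


def passes_veber(p: Properties) -> bool:
    return p.rot_bonds <= 10 and p.tpsa <= 140


def is_druglike(p: Properties) -> bool:
    return passes_lipinski(p) and passes_ghose(p) and passes_veber(p)


# Invented values for illustration; they are not taken from Table 6.
example = Properties(mw=398.3, logp=3.2, hbd=0, hba=6, mr=104.5,
                     n_atoms=47, rot_bonds=5, tpsa=58.4)
print(is_druglike(example))
```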

In addition, we used the systematic evaluation module of the ADMETLab web server [70], which allowed us to calculate 31 pharmacokinetic and toxicity endpoints for each of the designed molecules (Supplementary Material 4). In terms of absorption, acceptable values of Papp (Caco-2) permeability, human intestinal absorption, and oral bioavailability were predicted for the four designed molecules. In terms of distribution, the molecules were predicted to exhibit values of plasma protein binding (PPB) lower than 85%, as well as volumes of distribution in the desirable range (0.04–20 L/kg). The four designed molecules also exhibited adequate blood-brain barrier (BBB) permeability, which is a very important property for chemicals intended to act on the central nervous system, as is the case for drugs used to treat mood disorders. Regarding metabolism, the designed molecules exhibited moderate to low levels of promiscuity when predicted against the five major cytochrome P450 (CYP) enzymes, namely CYP1A2, CYP3A4, CYP2C9, CYP2C19, and CYP2D6. In general, the designed molecules were predicted to be poor inhibitors of these CYPs, while being estimated mainly as substrates of CYP1A2, CYP3A4, and CYP2C19. Regarding the elimination profiles, the designed molecules were predicted to exhibit low clearance. In terms of toxicity, a property of concern is that the designed molecules may behave as hERG blockers. However, for this endpoint the developers of the web server used the rigorous cutoff value IC₅₀ < 40 µM (usually, the cutoff for off-target toxicity is IC₅₀ ≤ 10 µM). Therefore, it cannot be concluded that our designed molecules would exhibit cardiotoxicity due to hERG inhibition. Although a certain degree of hepatotoxicity was predicted for the designed molecules, it was also estimated that they would not cause drug-induced liver injury. In the case of mutagenicity, skin sensitization, and in vivo acute toxicity, the designed molecules were predicted to be safe, except for AMD-01, which was predicted to exhibit a certain degree of in vivo toxicity. Altogether, the relatively acceptable pharmacokinetic and toxicity profiles predicted by the ADMETLab web server indicate that the molecules designed in this work deserve further exploration at the experimental level in the context of early drug discovery for mood disorders.
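The numerical ranges quoted above lend themselves to a small check, and the hERG argument can be illustrated by showing how the same IC₅₀ value is labeled under the two cutoffs. This is a hedged sketch: the function names and input values are hypothetical, and the web server returns model predictions rather than the raw values assumed here.

```python
def distribution_flags(ppb_percent: float, vd_l_per_kg: float) -> dict:
    """Check the two distribution criteria quoted in the text."""
    return {
        "ppb_below_85": ppb_percent < 85.0,
        "vd_in_range": 0.04 <= vd_l_per_kg <= 20.0,
    }


def herg_label(ic50_um: float, cutoff_um: float) -> str:
    """Label a compound as a hERG blocker when its IC50 falls below the cutoff."""
    return "blocker" if ic50_um < cutoff_um else "non-blocker"


# Hypothetical compound with an IC50 of 25 uM: the label depends on the cutoff.
strict = herg_label(25.0, cutoff_um=40.0)   # rigorous cutoff cited for the web server
common = herg_label(25.0, cutoff_um=10.0)   # more usual off-target cutoff

print(strict, common)
print(distribution_flags(ppb_percent=72.0, vd_l_per_kg=1.3))
```

A compound flagged under the 40 µM cutoff may therefore not be flagged under the 10 µM cutoff, which is why a positive hERG prediction alone does not establish cardiotoxicity.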

Conclusions

Mood disorders are a group of complex medical conditions whose treatment based on single-target drugs has proven to be ineffective. New chemicals used as dual- or multi-target therapeutic agents could provide better efficacy. The PTML-MLP model created in this work is a promising attempt to accelerate the discovery of chemicals with the potential to tackle mood disorders by acting as dual-target inhibitors of the NET and SERT proteins. Compared with other works reported in the field, our study provides deeper insights into the physicochemical aspects and structural features that can be essential for the appearance and enhancement of dual-target activity against these proteins. In the context of treatments for mood disorders, our PTML-MLP model is beneficial for basic science because it can accelerate the early discovery of novel chemicals with dual-target activity. In addition, owing to its good predictive power, the model could also prove useful in translational research as a tool for drug repurposing, i.e., for identifying FDA-approved drugs with different therapeutic indications that have the potential to treat mood disorders. The present work opens new horizons for the development and application of interpretable machine learning models in different therapeutic areas.

Data availability

Data is provided within the manuscript and supplementary material files.

Change history

28 April 2025
The original online version of this article was revised: The ORCID IDs are correctly given for the authors.

References

GBD 2019 Diseases and Injuries Collaborators. Global burden of 369 diseases and injuries in 204 countries and territories, 1990–2019: a systematic analysis for the Global Burden of Disease Study 2019. Lancet. 2020;396:1204–22.
Martin-Key NA, Olmert T, Barton-Owen G, Han SYS, Cooper JD, Eljasz P, et al. The Delta Study - Prevalence and characteristics of mood disorders in 924 individuals with low mood: results of the World Health Organization Composite International Diagnostic Interview (CIDI). Brain Behav. 2021;11:e02167.
Greenberg PE, Fournier AA, Sisitsky T, Simes M, Berman R, Koenigsberg SH, et al. The economic burden of adults with major depressive disorder in the United States (2010 and 2018). PharmacoEconomics. 2021;39:653–65.
Rush AJ, Trivedi MH, Wisniewski SR, Nierenberg AA, Stewart JW, Warden D, et al. Acute and longer-term outcomes in depressed outpatients requiring one or several treatment steps: a STAR*D report. Am J Psychiatry. 2006;163:1905–17.
Stolk RP, Rosmalen JG, Postma DS, de Boer RA, Navis G, Slaets JP, et al. Universal risk factors for multifactorial diseases: LifeLines: a three-generation population-based study. Eur J Epidemiol. 2008;23:67–74.
Schaduangrat N, Lampa S, Simeon S, Gleeson MP, Spjuth O, Nantasenamat C. Towards reproducible computational drug discovery. J Cheminformatics. 2020;12:9.
Liu H, Wu Y, Li C, Tang Q, Zhang YW. Molecular docking and biochemical validation of (-)-syringaresinol-4-O-beta-D-apiofuranosyl-(1–>2)-beta-D-glucopyranoside binding to an allosteric site in monoamine transporters. Front Pharmacol. 2022;13:1018473.
Gaber A, Alsanie WF, Alhomrani M, Alamri AS, Alyami H, Shakya S, et al. Multispectral and molecular Docking studies reveal potential effectiveness of antidepressant fluoxetine by forming pi-acceptor complexes. Molecules. 2022;27:5883.
Zhang TT, Xue R, Wang X, Zhao SW, An L, Li YF, et al. Network-based drug repositioning: a novel strategy for discovering potential antidepressants and their mode of action. Eur Neuropsychopharmacol. 2018;28:1137–50.
Avram S, Stan MS, Udrea AM, Buiu C, Boboc AA, Mernea M. 3D-ALMOND-QSAR models to predict the antidepressant effect of some natural compounds. Pharmaceutics. 2021;13:1449.
Islas AA, Moreno LG, Scior T. Induced fit, ensemble binding space docking and Monte Carlo simulations of MDMA ‘ecstasy’ and 3D pharmacophore design of MDMA derivatives on the human serotonin transporter (hSERT). Heliyon. 2021;7:e07784.
Wang YQ, Lin WW, Wu N, Wang SY, Chen MZ, Lin ZH, et al. Structural insight into the serotonin (5-HT) receptor family by molecular docking, molecular dynamics simulation and systems pharmacology analysis. Acta Pharmacol Sin. 2019;40:1138–56.
Olasupo SB, Uzairu A, Shallangwa G, Uba S. QSAR analysis and molecular docking simulation of norepinephrine transporter (NET) inhibitors as anti-psychotic therapeutic agents. Heliyon. 2019;5:e02640.
Olasupo SB, Uzairu A, Shallangwa GA, Uba S. Chemoinformatic studies on some inhibitors of dopamine transporter and the receptor targeting schizophrenia for developing novel antipsychotic agents. Heliyon. 2020;6:e04464.
Qu SY, Li XY, Heng X, Qi YY, Ge PY, Ni SJ, et al. Analysis of antidepressant activity of Huang-Lian Jie-Du Decoction through Network Pharmacology and Metabolomics. Front Pharmacol. 2021;12:619288.
Gonzalez-Diaz H, Arrasate S, Gomez-SanJuan A, Sotomayor N, Lete E, Besada-Porto L, et al. General theory for multiple input-output perturbations in complex molecular systems. 1. Linear QSPR electronegativity models in physical, organic, and medicinal chemistry. Curr Top Med Chem. 2013;13:1713–41.
Speck-Planche A, Cordeiro MNDS. Multitasking models for quantitative structure-biological effect relationships: current status and future perspectives to speed up drug discovery. Expert Opin Drug Discov. 2015;10:245–56.
Halder AK, Moura AS, Cordeiro MNDS. Moving average-based Multitasking in Silico classification modeling: where do we stand and what is Next? Int J Mol Sci. 2022;23:4937.
Kleandrova VV, Speck-Planche A. PTML modeling for pancreatic Cancer Research: in Silico Design of Simultaneous Multi-protein and Multi-cell inhibitors. Biomedicines. 2022;10:491.
Santana R, Zuluaga R, Ganan P, Arrasate S, Onieva E, Montemore MM, et al. PTML Model for Selection of Nanoparticles, anticancer drugs, and vitamins in the design of drug-vitamin nanoparticle Release systems for Cancer Cotherapy. Mol Pharm. 2020;17:2612–27.
Kleandrova VV, Scotti MT, Scotti L, Nayarisseri A, Speck-Planche A. Cell-based multi-target QSAR model for design of virtual versatile inhibitors of liver cancer cell lines. SAR QSAR Environ Res. 2020;31:815–36.
Speck-Planche A. Multicellular target QSAR Model for Simultaneous Prediction and Design of Anti-pancreatic Cancer agents. ACS Omega. 2019;4:3122–32.
Bediaga H, Arrasate S, Gonzalez-Diaz H. PTML Combinatorial Model of ChEMBL compounds assays for multiple types of Cancer. ACS Comb Sci. 2018;20:621–32.
Dieguez-Santana K, Casanola-Martin GM, Torres R, Rasulev B, Green JR, Gonzalez-Diaz H. Machine learning study of metabolic networks vs ChEMBL Data of Antibacterial compounds. Mol Pharm. 2022;19:2151–63.
Ortega-Tenezaca B, Gonzalez-Diaz H. IFPTML mapping of nanoparticle antibacterial activity vs. pathogen metabolic networks. Nanoscale. 2021;13:1318–30.
Urista DV, Carrue DB, Otero I, Arrasate S, Quevedo-Tumailli VF, Gestal M, et al. Prediction of Antimalarial Drug-decorated nanoparticle Delivery systems with Random Forest models. Biology (Basel). 2020;9:198.
Speck-Planche A, Kleandrova VV. Multi-condition QSAR Model for the virtual design of chemicals with dual pan-antiviral and anti-cytokine storm profiles. ACS Omega. 2022;7:32119–30.
Kleandrova VV, Scotti MT, Speck-Planche A. Computational drug repurposing for Antituberculosis Therapy: Discovery of Multi-strain inhibitors. Antibiotics (Basel). 2021;10:1005.
Diez-Alarcia R, Yanez-Perez V, Muneta-Arrate I, Arrasate S, Lete E, Meana JJ, et al. Big Data challenges Targeting proteins in GPCR Signaling pathways; combining PTML-ChEMBL models and [(35)S]GTPgammaS binding assays. ACS Chem Neurosci. 2019;10:4476–91.
Kleandrova VV, Speck-Planche A. PTML modeling for Alzheimer’s Disease: design and prediction of virtual Multi-target inhibitors of GSK3B, HDAC1, and HDAC6. Curr Top Med Chem. 2020;20:1661–76.
Speck-Planche A. Combining ensemble learning with a fragment-based Topological Approach to generate New Molecular Diversity in Drug Discovery: in Silico Design of Hsp90 inhibitors. ACS Omega. 2018;3:14704–16.
Kleandrova VV, Cordeiro MNDS, Speck-Planche A. Optimizing drug discovery using multitasking models for quantitative structure-biological effect relationships: an update of the literature. Expert Opin Drug Discov. 2023;18:1231–43.
Speck-Planche A, Kleandrova VV. Demystifying Artificial neural networks as generators of New Chemical Knowledge: Antimalarial Drug Discovery as a case study. In: Cartwright HM, editor. Machine learning in Chemistry: the impact of Artificial Intelligence. London, United Kingdom: The Royal Society of Chemistry; 2020. pp. 398–423.
Stachowicz K, Sowa-Kucma M. The treatment of depression - searching for new ideas. Front Pharmacol. 2022;13:988648.
Mathew SJ, Manji HK, Charney DS. Novel drugs and therapeutic targets for severe mood disorders. Neuropsychopharmacology. 2008;33:2080–92.
Kleandrova VV, Scotti L, Bezerra Mendonça Junior FJ, Muratov E, Scotti MT, Speck-Planche A. QSAR modeling for Multi-target Drug Discovery: Designing simultaneous inhibitors of proteins in diverse pathogenic parasites. Front Chem. 2021;9:634663.
Gaulton A, Bellis LJ, Bento AP, Chambers J, Davies M, Hersey A, et al. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 2012;40:D1100–7.
Mendez D, Gaulton A, Bento AP, Chambers J, De Veij M, Felix E, et al. ChEMBL: towards direct deposition of bioassay data. Nucleic Acids Res. 2019;47:D930–40.
Estrada E, Gutiérrez Y. MODESLAB. v1.5. Santiago de Compostela, Spain; 2002.
Speck-Planche A, Kleandrova VV, Scotti MT. In silico drug repurposing for anti-inflammatory therapy: virtual search for dual inhibitors of Caspase-1 and TNF-Alpha. Biomolecules. 2021;11:1832.
Urias RW, Barigye SJ, Marrero-Ponce Y, Garcia-Jacas CR, Valdes-Martini JR, Perez-Gimenez F. IMMAN: free software for information theory-based chemometric analysis. Mol Divers. 2015;19:305–19.
Wassermann AM, Nisius B, Vogt M, Bajorath J. Identification of descriptors capturing compound class-specific features by mutual information analysis. J Chem Inf Model. 2010;50:1935–40.
TIBCO-Software-Inc. STATISTICA (Data Analysis Software System), v13.5.0.17. Palo Alto, California, USA, 2018.
Kleandrova VV, Cordeiro MNDS, Speck-Planche A. Perturbation theory machine learning model for phenotypic early antineoplastic drug Discovery: design of virtual Anti-lung-cancer agents. Appl Sci. 2024;14:9344.
Chicco D, Jurman G. The Matthews correlation coefficient (MCC) should replace the ROC AUC as the standard metric for assessing binary classification. BioData Min. 2023;16:4.
Schneider G, Wrede P. Artificial neural networks for computer-based molecular design. Prog Biophys Mol Biol. 1998;70:175–222.
Manallack DT, Livingstone DJ, A-Razzak M, Glen RC. Neural Networks and Expert Systems in Molecular Design. In: van de Waterbeemd H, ed. Advanced Computer-Assisted Techniques in Drug Discovery, 1994:293–331.
Kleandrova VV, Scotti MT, Scotti L, Speck-Planche A. Multi-target Drug Discovery Via PTML modeling: applications to the design of virtual dual inhibitors of CDK4 and HER2. Curr Top Med Chem. 2021;21:661–75.
Kleandrova VV, Speck-Planche A. The QSAR paradigm in fragment-based drug Discovery: from the virtual generation of target inhibitors to Multi-scale modeling. Mini Rev Med Chem. 2020;20:1357–74.
Zhou X, Lin H, Lin H. Global sensitivity analysis. In: Shekhar S, Xiong H, editors. Encyclopedia of GIS. Boston, MA: Springer US; 2008. pp. 408–9.
Estrada E. Spectral moments of the edge adjacency matrix in molecular graphs. 1. Definition and applications for the prediction of physical properties of alkanes. J Chem Inf Comput Sci. 1996;36:844–49.
Estrada E. Spectral moments of the edge adjacency matrix in molecular graphs. 2. Molecules containing heteroatoms and QSAR applications. J Chem Inf Comput Sci. 1997;37:320–28.
Estrada E. Spectral moments of the edge adjacency matrix in molecular graphs. 3. Molecules containing cycles. J Chem Inf Comput Sci. 1998;38:23–7.
Estrada E. How the parts organize in the whole? A top-down view of molecular descriptors and properties for QSAR and drug design. Mini Rev Med Chem. 2008;8:213–21.
Estrada E, Patlewicz G, Gutierrez Y. From knowledge generation to knowledge archive. A general strategy using TOPS-MODE with DEREK to formulate new alerts for skin sensitization. J Chem Inf Comput Sci. 2004;44:688–98.
Estrada E, Molina E. Automatic extraction of structural alerts for predicting chromosome aberrations of organic compounds. J Mol Graph Model. 2006;25:275–88.
Estrada E. Edge adjacency relationship and a novel topological index related to molecular volume. J Chem Inf Comput Sci. 1995;35:31–3.
Estrada E. Edge adjacency relationships in molecular graphs containing heteroatoms: a new topological index related to molar volume. J Chem Inf Comput Sci. 1995;35:701–7.
Estrada E, Rodríguez L. Edge-connectivity indices in QSPR/QSAR studies. 1. Comparison to other Topological indices in QSPR studies. J Chem Inf Comput Sci. 1999;39:1037–41.
Estrada E. Physicochemical Interpretation of Molecular Connectivity Indices. J Phys Chem A. 2002;106:9085–91.
Randić M, Zupan J. On interpretation of well-known topological indices. J Chem Inf Comput Sci. 2001;41:550–60.
Overington J. ChEMBL. An interview with John Overington, team leader, chemogenomics at the European Bioinformatics Institute Outstation of the European Molecular Biology Laboratory (EMBL-EBI). Interview by Wendy A. Warr. J Comput Aided Mol Des. 2009;23:195–8.
Irwin JJ, Shoichet BK. ZINC–a free database of commercially available compounds for virtual screening. J Chem Inf Model. 2005;45:177–82.
Hersey A, Chambers J, Bellis L, Patricia Bento A, Gaulton A, Overington JP. Chemical databases: curation or integration by user-defined equivalence? Drug Discov Today Technol. 2015;14:17–24.
Papadatos G, Davies M, Dedman N, Chambers J, Gaulton A, Siddle J, et al. SureChEMBL: a large-scale, chemically annotated patent document database. Nucleic Acids Res. 2016;44:D1220–8.
Mauri A. alvaDesc: a tool to calculate and analyze molecular descriptors and fingerprints. In: Roy K, editor. Ecotoxicological QSARs. New York, NY: Springer US; 2020. pp. 801–20.
Lipinski CA, Lombardo F, Dominy BW, Feeney PJ. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv Drug Deliv Rev. 2001;46:3–26.
Ghose AK, Viswanadhan VN, Wendoloski JJ. A knowledge-based approach in designing combinatorial or medicinal chemistry libraries for drug discovery. 1. A qualitative and quantitative characterization of known drug databases. J Comb Chem. 1999;1:55–68.
Veber DF, Johnson SR, Cheng HY, Smith BR, Ward KW, Kopple KD. Molecular properties that influence the oral bioavailability of drug candidates. J Med Chem. 2002;45:2615–23.
Dong J, Wang NN, Yao ZJ, Zhang L, Cheng Y, Ouyang D, et al. ADMETlab: a platform for systematic ADMET evaluation based on a comprehensively collected ADMET database. J Cheminformatics. 2018;10:29.

Acknowledgements

Not applicable.

Funding

This work was financially supported by FCT/MCTES through national funds (grant UIDB/50006/2020).

Author information

Authors and Affiliations

LAQV@REQUIMTE/Department of Chemistry and Biochemistry, Faculty of Sciences, University of Porto, Porto, 4169-007, Portugal
Valeria V. Kleandrova, M. Natália D. S. Cordeiro & Alejandro Speck-Planche

Authors

Valeria V. Kleandrova
M. Natália D. S. Cordeiro
Alejandro Speck-Planche

Contributions

Conceptualization, A.S.-P.; methodology, A.S.-P.; software, A.S.-P. and V.V.K.; validation, A.S.-P.; formal analysis, A.S.-P.; investigation, A.S.-P., V.V.K., and M.N.D.S.C.; resources, A.S.-P., V.V.K., and M.N.D.S.C.; data curation, A.S.-P. and V.V.K.; writing—original draft preparation, A.S.-P., V.V.K., and M.N.D.S.C.; writing—review and editing, A.S.-P.; visualization, A.S.-P. and V.V.K.; supervision, A.S.-P.; project administration, A.S.-P. and V.V.K.; funding acquisition, M.N.D.S.C.

Corresponding author

Correspondence to Alejandro Speck-Planche.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1: D[TBI]cj descriptors, classification results, local metrics, and applicability domain.

13065_2024_1376_MOESM2_ESM.xlsx

Supplementary Material 2: Topological descriptors (TIs and NTIs), D[TBI]cj descriptors, classification results, and applicability domain for the designed molecules.

13065_2024_1376_MOESM3_ESM.xlsx

Supplementary Material 3: Topological descriptors (TIs and NTIs), averages avg[TBI]cj, and standard deviation values Std[TBI].

Supplementary Material 4

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Kleandrova, V.V., Cordeiro, M.N.D.S. & Speck-Planche, A. Perturbation-theory machine learning for mood disorders: virtual design of dual inhibitors of NET and SERT proteins. BMC Chemistry 19, 2 (2025). https://doi.org/10.1186/s13065-024-01376-z

Received: 04 October 2024
Accepted: 27 December 2024
Published: 02 January 2025
DOI: https://doi.org/10.1186/s13065-024-01376-z
