Traynor's blog
https://blogs.warwick.ac.uk/ctraynor
This blog will shed light on complicated problems in mathematics, statistics, biology and physics. The main topics will be the application of Machine Learning techniques and other Statistical Elements.
(C) 2021 Carlos Serra Traynor. Warwick Blogs, University of Warwick, https://blogs.warwick.ac.uk
Markov Chain Monte Carlo made easy: Gibbs sampling. by Carlos Serra Traynor
https://blogs.warwick.ac.uk/ctraynor/entry/step-by-step_gibbs_sampling/
<p>In a previous <a href="http://blogs.warwick.ac.uk/ctraynor/entry/a_low-tech_monte/">post</a> we introduced Monte Carlo techniques and hinted at their many applications. In another <a href="http://blogs.warwick.ac.uk/ctraynor/entry/weather_forecast_with/">post</a> we showed a simple Markov Chain. The core of Markov Chain Monte Carlo methods is coming up with a function that makes a probabilistic choice about which state to go to next in a Markov Chain, so that, similarly to the <a href="http://blogs.warwick.ac.uk/ctraynor/entry/a_low-tech_monte/">pi example</a>, each state is visited in proportion to the target function, which in turn lets us estimate the desired parameters. Gibbs sampling is one method that meets these requirements.</p>
<p>The central idea in Gibbs sampling is that, instead of jumping to the next state at once, a separate small (probabilistic) jump is made for each of the k parameters in the model, where each choice depends on all the other parameters. The algorithm is given by:</p>
<p><img src="https://blogs.warwick.ac.uk/images/ctraynor/2018/10/20/gibs.png?maxWidth=500" alt="gibs" border="0" /></p>
<p>where z are the <em>k</em> parameters in the model, and T is the number of transitions, that is, the number of times the model is sampled.</p>
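<p>To make the loop concrete, here is a minimal sketch in Python. It is not taken from the figure above: it assumes a toy target chosen only for illustration, a two-parameter (k = 2) standard normal distribution with correlation rho, for which each conditional distribution is itself normal. The names gibbs_bivariate_normal, rho and seed are hypothetical.</p>

```python
import random


def gibbs_bivariate_normal(rho, T, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho.

    Assumed toy target (not from the post): for this distribution
    z1 | z2 ~ Normal(rho * z2, 1 - rho**2), and symmetrically for z2 | z1.
    """
    rng = random.Random(seed)
    conditional_sd = (1.0 - rho ** 2) ** 0.5
    z1, z2 = 0.0, 0.0  # arbitrary starting state
    samples = []
    for _ in range(T):
        # One small probabilistic jump per parameter, each conditioned
        # on the most recent value of the other parameter.
        z1 = rng.gauss(rho * z2, conditional_sd)
        z2 = rng.gauss(rho * z1, conditional_sd)
        samples.append((z1, z2))
    return samples


samples = gibbs_bivariate_normal(rho=0.8, T=20000)
kept = samples[1000:]  # discard the early part of the walk
mean_z1 = sum(z1 for z1, _ in kept) / len(kept)
mean_product = sum(z1 * z2 for z1, z2 in kept) / len(kept)
print("estimated mean of z1:", mean_z1)
print("estimated E[z1 * z2] (should be near rho):", mean_product)
```

<p>Because both parameters have mean 0 and variance 1 in this toy target, the average of z1 * z2 over the walk is an estimate of rho.</p>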
<p>To sum up, Gibbs sampling walks through a k-dimensional state space. Every point in the walk is a collection of values for the random variables Z.</p>
Tags: Gibbs, Sampling. Sat, 20 Oct 2018 18:42:35 GMT
Weather forecast with a Markov Chain by Carlos Serra Traynor
https://blogs.warwick.ac.uk/ctraynor/entry/weather_forecast_with/
<p>Markov Chains are a computational tool useful for modelling systems made up of linked events. For example, take a simple weather forecast with three states: rainy, sunny, and cloudy. If we think about it, we soon realise that the weather behaves differently from the coin-tossing example given in basic statistics courses. In this real-world example, the events are not independent of the previous state. For example, the fact that today has been rainy might signal that tomorrow is going to be rainy too (especially if you live in England!). The structure of a Markov Chain for the weather forecast system can be built by drawing dots (aka states) and arrows (aka transitions). Here is an example with made-up probabilities for rainy, sunny and cloudy:</p>
<p><img src="/images/ctraynor/2018/10/20/mcdiagram.png?maxWidth=500" alt="mcdiagram.png" border="0" /></p>
<p>Note that, because the arrows are probabilities, the number associated with each of them must lie between 0 and 1 (and the arrows stemming from a dot must add up to 1!).</p>
<p>OK, now that we have written down the Markov diagram, we can fairly easily read off the probability of rain tomorrow given that today is rainy: 0.5. We may also be interested in the probability of rain in two days if it's cloudy today. To find it, we add up the probabilities of all the paths that lead from cloudy today to rainy in two days, which amounts to 0.42. This soon becomes cumbersome to calculate by hand, which is why it's so convenient to arrange the Markov Chain in a matrix P that predicts tomorrow's weather, and to use matrix arithmetic to calculate the weather for the day after tomorrow, given by P × P = P^2.</p>
<p><img src="https://blogs.warwick.ac.uk/images/ctraynor/2018/10/20/mcpath.png?maxWidth=500" alt="MCpathway" border="0" /></p>
<p>Similarly, P^3 would give the probabilities for three days, and so on. "The entire future unfolds from this one matrix".</p>
<p>In this simple example, the successive powers of the matrix rapidly converge to a configuration in which the entries no longer change:</p>
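<p>The path-adding calculation can be sketched in plain Python. The transition matrix below is an assumption: its numbers are hypothetical and were chosen only so that they reproduce the two values quoted in the text (0.5 for rainy to rainy, and 0.42 for cloudy to rainy in two days), so the diagram's actual values may differ. Rows are today's state and columns are tomorrow's, in the order rainy, sunny, cloudy.</p>

```python
STATES = ["rainy", "sunny", "cloudy"]

# Hypothetical transition matrix: P[i][j] is the probability of moving
# from state i today to state j tomorrow. Each row adds up to 1.
P = [
    [0.5, 0.2, 0.3],  # from rainy
    [0.3, 0.5, 0.2],  # from sunny
    [0.4, 0.2, 0.4],  # from cloudy
]


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def matrix_power(M, n):
    """Return M multiplied by itself n times (identity matrix for n = 0)."""
    size = len(M)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, M)
    return result


P2 = matrix_power(P, 2)  # weather for the day after tomorrow
P3 = matrix_power(P, 3)  # weather in three days

cloudy, rainy = STATES.index("cloudy"), STATES.index("rainy")
print("P(rainy in two days | cloudy today) =", round(P2[cloudy][rainy], 2))
```

<p>Each entry of P2 is exactly the sum over the intermediate day of the products along every two-step path, which is the calculation described above.</p>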
<p><img src="/images/ctraynor/2018/10/20/mcmatrix.png?maxWidth=500" alt="mcmatrix.png" border="0" /></p>
<p>There is a simple interpretation for this behaviour of the Chain. If we let the system evolve long enough, the probability of a given state no longer depends on the initial state. In other words, knowing that today is rainy may offer a clue about tomorrow's weather, but it's not much help in predicting the weather in one month. For such an extended forecast, we may as well consult the long-term averages, which are the values to which the Markov Chain converges.</p>
<p>It's been a pleasure to introduce Markov Chains, but if you're looking for more information, check out the resource from which I adapted the example: First Links in the Markov Chain (https://raichev.net/markov/misc/markov_chain.pdf).</p>
Tags: Chain, Markov. Sat, 20 Oct 2018 11:24:09 GMT
A low-tech Monte Carlo technique to approximate π by Carlos Serra Traynor
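<p>This loss of memory can be checked with a short sketch. It assumes a hypothetical transition matrix (rows are today's state, columns tomorrow's, in the order rainy, sunny, cloudy) whose numbers are illustrative and not taken from the diagram. Starting from three different certainties about today, the forecasts are pushed forward day by day and compared.</p>

```python
# Hypothetical transition matrix; each row adds up to 1.
P = [
    [0.5, 0.2, 0.3],  # from rainy
    [0.3, 0.5, 0.2],  # from sunny
    [0.4, 0.2, 0.4],  # from cloudy
]


def next_day(dist, P):
    """Push a probability distribution over states one day forward."""
    return [
        sum(dist[i] * P[i][j] for i in range(len(P))) for j in range(len(P[0]))
    ]


def forecast(start, P, days):
    """Distribution over states after the given number of days."""
    dist = list(start)
    for _ in range(days):
        dist = next_day(dist, P)
    return dist


# Three different starting points: today is certainly rainy, sunny or cloudy.
from_rainy = forecast([1.0, 0.0, 0.0], P, 30)
from_sunny = forecast([0.0, 1.0, 0.0], P, 30)
from_cloudy = forecast([0.0, 0.0, 1.0], P, 30)

# Largest disagreement between the three one-month forecasts.
largest_gap = max(
    max(abs(a - b), abs(a - c))
    for a, b, c in zip(from_rainy, from_sunny, from_cloudy)
)
print("one-month forecast starting from rainy:", from_rainy)
print("largest gap between starting points:", largest_gap)
```

<p>After a month the three forecasts agree to many decimal places, so the starting state has effectively been forgotten and only the long-term averages remain.</p>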
https://blogs.warwick.ac.uk/ctraynor/entry/a_low-tech_monte/
<p>Monte Carlo algorithms are used to solve involved integrals with no closed-form solution. For non-mathematicians this first sentence may already sound difficult and cumbersome. However, we should think of Monte Carlo techniques as a powerful ally in really difficult problems. Let's explore an easy example of a Monte Carlo technique to get familiar with it. Suppose that you'd like to estimate the value of π. Draw the following perfect square on the ground and inscribe a circle in it:</p>
<p><img src="https://blogs.warwick.ac.uk/images/ctraynor/2018/10/18/circle00.png?maxWidth=500" alt="circle00" border="0" /><br />
</p>
<p>Now take a bag of rice and scatter 20 grains uniformly at random inside the square:</p>
<p><img src="https://blogs.warwick.ac.uk/images/ctraynor/2018/10/18/circle20.png?maxWidth=500" alt="circle20" border="0" /></p>
<p>Now, assuming that the scattering was random, the ratio between the number of grains in the circle (C) and the number of grains in the square (S) should approximate the ratio between the area of the circle and the area of the square, given by:</p>
<p>C/S = π(d/2)^2/d^2</p>
<p>Solving for π we get:</p>
<p>π ~ 4C/S</p>
<p>In our example, with 15 of the 20 grains inside the circle, this gives: 4*15/20 = 60/20 = 3.</p>
<p>We have approximated the value of π to be 3, not too bad for a Monte Carlo simulation with only 20 random points.</p>
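<p>The same experiment can be sketched in Python, with random points standing in for the grains of rice. The function name estimate_pi and the fixed seed are hypothetical choices for this illustration; the square is assumed to have side d = 1, centred on the origin.</p>

```python
import random


def estimate_pi(n_grains, seed=0):
    """Estimate pi by scattering points uniformly over a unit square.

    The inscribed circle has diameter d = 1, so a point is inside it
    when its distance from the centre is at most d / 2.
    """
    rng = random.Random(seed)
    in_circle = 0
    for _ in range(n_grains):
        x = rng.uniform(-0.5, 0.5)
        y = rng.uniform(-0.5, 0.5)
        if x * x + y * y <= 0.25:  # (d / 2) ** 2 with d = 1
            in_circle += 1
    # C / S approximates pi / 4, so pi is approximately 4 * C / S.
    return 4.0 * in_circle / n_grains


for n in (20, 2000, 200000):
    print(n, "grains:", estimate_pi(n))
```

<p>With only 20 grains the estimate is rough, as in the example above; scattering more grains brings it closer to π.</p>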
<p>(The figure was adapted from https://towardsdatascience.com/a-zero-math-introduction-to-markov-chain-monte-carlo-methods-dcba889e0c50, and the text from GIBBS SAMPLING FOR THE UNINITIATED; visit these resources if you'd like to learn more about MCMC.)</p>
Tags: Carlo, Monte, Techniques. Thu, 18 Oct 2018 09:05:49 GMT
QSP for AMR: Modelling how the drugs get into the bugs by Carlos Serra Traynor
https://blogs.warwick.ac.uk/ctraynor/entry/qsp-uk_network_satellite/
<p class="answer">Writing about web page <a href="https://warwick.ac.uk/fac/sci/eng/qsp-uk_network_satellite_conference" title="Related external link: https://warwick.ac.uk/fac/sci/eng/qsp-uk_network_satellite_conference">https://warwick.ac.uk/fac/sci/eng/qsp-uk_network_satellite_conference</a></p>
<p><b>What's quantitative & systems pharmacology?</b></p>
<p>“Quantitative and Systems Pharmacology (QSP) is an emerging discipline focused on identifying and validating drug targets, understanding existing therapeutics and discovering new ones.”-Quantitative and Systems Pharmacology in the Post-genomic Era: New Approaches to Discovering Drugs and Understanding Therapeutic Mechanisms.</p>
<p><b>AMR is gaining increasing importance in healthcare settings. But what’s AMR?</b></p>
<p>Antibiotics interfere with the complex “machinery” inside bacteria, for example by disrupting their metabolism and slowing down their growth significantly, so they are less of a threat. Other antibiotics target the DNA and prevent it from replicating, which stops the bacteria from multiplying and ultimately kills them. Others simply rip the outer layer of the bacteria to shreds, so their insides spill out and they die quickly. All of this happens without bothering the body's own cells.</p>
<p>But now evolution is making things more complicated. Through small random changes, a small number of bacteria might find a way to protect themselves, for example by intercepting the antibiotic and changing the molecule so it becomes harmless, or by investing energy in pumps that eject the antibiotic before it can do damage.</p>
<p>Bacteria have two kinds of DNA: the chromosome and small free-floating pieces called plasmids, with which they can exchange useful immunities. Alternatively, in a process called transformation, bacteria can harvest dead bacteria and collect pieces of their DNA. This even works between different bacterial species and can lead to superbugs: bacteria that are immune to multiple kinds of antibiotics. A variety of superbugs already exist in the world, and hospitals in particular are the perfect breeding grounds for them.</p>
<p>As a society, we have to change our habits around the use of antibiotics and keep them as a last-resort drug. In addition, interdisciplinary research is needed to keep developing new antibiotics.</p>
<p><strong>Additional topics:</strong></p>
<p><b>XChem: new experimental opportunities for testing theory</b></p>
<p>This team is making breakthrough discoveries in the fields of macromolecular crystallography, imaging and microscopy, biological cryo-imaging, magnetic materials, structures and surfaces, spectroscopy, and crystallography, which are generating high-throughput data that may accelerate the discovery of new medicines.</p>
<p>Moreover, they are committed to open data standards, and all the 3D structures of human proteins that are being elucidated are published so that data analysts can test potential novel therapies in silico. Cancer-related proteins, including human protein kinases, metabolism-associated proteins, integral membrane proteins and proteins associated with epigenetics, are the focus of the team, and more information can be found on their <a href="https://www.diamond.ac.uk/Instruments/Mx/Fragment-Screening/XChem-team.html">website</a>.</p>
<p><b><a href="https://rd.springer.com/article/10.1007/s40262-018-0659-0">Pharmacokinetic–Pharmacodynamic Modeling in Pediatric Drug Development, and the Importance of Standardized Scaling of Clearance</a></b></p>
<p>Since modelling can readily be used to extrapolate results in adults to children, thereby avoiding clinical trials in children, all stakeholders have a huge interest in clarifying when that is an appropriate practice. In principle, extrapolation should be done whenever it is reasonable to assume that children, in comparison to adults, have a similar disease progression, response to intervention and exposure-response relationship. However, if the exposure-response relationship is dissimilar but there is a PD measurement that can be used to predict efficacy in children, it would still be possible to conduct partial extrapolation. The decision tree below summarises the idea:</p>
<p><img src="https://blogs.warwick.ac.uk/images/ctraynor/2018/09/24/decisiontreepedi.png?maxWidth=500" alt="Decision tree pediatrics (E. Germovsek et al.)" border="0" /></p>
<p>Early paediatric dosing did not take into consideration the complex maturation that occurs in human beings. Instead, the dose was simply scaled down linearly with weight. This flawed practice led to serious adverse events such as the <a href="https://en.wikipedia.org/wiki/Gray_baby_syndrome" style="font-family: AdvPTimes, serif; font-size: 10pt; letter-spacing: 0.008em;">gray baby syndrome</a> and <a href="https://en.wikipedia.org/wiki/Kernicterus" style="font-family: AdvPTimes, serif; font-size: 10pt; letter-spacing: 0.008em;">kernicterus</a>. One of the first achievements in modelling the dose in paediatrics was the use of Body Surface Area (Crawford et al.), which dramatically improved the efficacy and safety profile. More recently, a combination of allometric weight scaling with a sigmoidal function has been proposed to describe the changes in clearance (Cl) due to age and weight:</p>
<p><img src="https://blogs.warwick.ac.uk/images/ctraynor/2018/09/24/clearance.png?maxWidth=500" alt="cl" border="0" /><br />
</p>
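<p>As a rough illustration of how such a model behaves, here is a minimal sketch in Python. It does not reproduce the figure: it assumes one commonly used form, in which clearance is a standard adult value multiplied by an allometric size term, (weight / 70 kg) raised to the power 0.75, and by a sigmoidal maturation term in postmenstrual age. The function name and every parameter value below are hypothetical placeholders, not estimates for any drug.</p>

```python
def clearance(weight_kg, pma_weeks, cl_std, tm50_weeks, hill):
    """Clearance from allometric weight scaling and sigmoidal maturation.

    Assumed form (a sketch, not the exact expression in the figure):
        Cl = cl_std * (weight / 70) ** 0.75
                    * pma ** hill / (tm50 ** hill + pma ** hill)
    cl_std     clearance of a standard 70 kg adult
    pma_weeks  postmenstrual age in weeks
    tm50_weeks age at which maturation reaches 50% of the adult value
    hill       steepness of the sigmoidal maturation curve
    """
    size = (weight_kg / 70.0) ** 0.75
    maturation = pma_weeks ** hill / (tm50_weeks ** hill + pma_weeks ** hill)
    return cl_std * size * maturation


# Hypothetical parameter values, for illustration only.
CL_STD, TM50, HILL = 10.0, 50.0, 3.0

for label, weight, pma in [
    ("neonate", 3.5, 40.0),
    ("infant", 8.0, 80.0),
    ("child", 20.0, 350.0),
]:
    print(label, round(clearance(weight, pma, CL_STD, TM50, HILL), 3))
```

<p>The size term accounts for weight, while the maturation term rises from near 0 towards 1 with age, so the youngest children get a lower clearance than weight alone would predict.</p>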
<p>On the other hand, for extrapolation, we are instead aiming to use modelling techniques that account for individual variation. A prominent example is Non-Linear Mixed-Effects Modelling (NLME), where all the study data are fitted simultaneously in one model, but the PK parameters may vary between individuals (VBI). This approach has become standard practice because it provides unbiased estimates through simultaneous estimation of parameter-level interindividual variability and observation-level residual variability.</p>
Tags: Antimicrobial, Qsp. Tue, 25 Sep 2018 12:59:03 GMT
Data Challenge by Carlos Serra Traynor
https://blogs.warwick.ac.uk/ctraynor/entry/data_challenge/
<p class="answer">Writing about web page <a href="https://www.kaggle.com/competitions" title="Related external link: https://www.kaggle.com/competitions">https://www.kaggle.com/competitions</a></p>
<p>Today is the day to start a Data Challenge. Over 5 days we are going to go through an Introduction to Data Science in Python, a Regression Challenge in R and finally an introduction to Matlab on Friday. I hope this will be very enjoyable!</p>
<p><b>Challenge in Python : Data cleaning</b></p>
<p>This is presented as an introduction to Python. The first challenge is to explore the dataset. It is fairly easy; however, as you know, easy things are the best for learning and understanding. After all, Leonardo da Vinci needed to learn to draw before painting the Mona Lisa.</p>
<p>The dataset that I have chosen is the Adverse events dataset; many others are available on Kaggle, for example here: https://www.kaggle.com/rtatman/fun-beginner-friendly-datasets/</p>
<p>The solution consists of loading the data and using describe().</p>
<p>Note that describe() by default summarises only the numeric columns; if we are instead interested in, for example, categorical variables, we can use count().</p>
<p>Here you can find my solution!</p>
<p><a href="https://github.com/csetraynor/PythonChallenge.git">https://github.com/csetraynor/PythonChallenge</a></p>
<p><b>Challenge in R: Regression modelling</b></p>
<p>Regression models output variables (y) as a function of input variables (x). There are many different ways to model regression, and an important family is the so-called "generalised linear models".</p>
<p>Three kinds of regression are:</p>
<p>-Linear: Prediction of a continuous variable.</p>
<p>-Logistic: Prediction of a categorical variable, for example a binary output 0, 1.</p>
<p>-Poisson: Prediction of a count variable.</p>
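<p>One way to see how the three kinds differ is to look at how each turns the same linear combination of the inputs into a prediction. The sketch below is in Python rather than R and is only an illustration: the coefficients b0 and b1 are hypothetical, and the function names are made up for this example. It assumes the usual choices of an identity, a logistic and an exponential mapping, respectively.</p>

```python
import math


def linear_mean(eta):
    """Linear regression: the prediction is the linear predictor itself."""
    return eta


def logistic_mean(eta):
    """Logistic regression: squash the predictor into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


def poisson_mean(eta):
    """Poisson regression: exponentiate to get a positive expected count."""
    return math.exp(eta)


# Hypothetical coefficients for a single input variable x.
b0, b1 = -1.0, 0.5

for x in (0.0, 2.0, 4.0):
    eta = b0 + b1 * x  # the linear predictor shared by all three models
    print(
        "x =", x,
        "| linear:", linear_mean(eta),
        "| logistic:", round(logistic_mean(eta), 3),
        "| poisson:", round(poisson_mean(eta), 3),
    )
```

<p>The linear model can predict any real number, the logistic model always returns a probability between 0 and 1, and the Poisson model always returns a positive expected count.</p>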
<p><b>Github repo</b></p>
<p>This is a link to the progress line of this challenge, where I will upload all the problems for this challenge.</p>
<p><a href="https://github.com/csetraynor/DataChallenge.git">https://github.com/csetraynor/DataChallenge</a></p>
Tags: Data, Learning, Machine, Regression, Science. Thu, 02 Aug 2018 22:24:15 GMT