python机器学习-乳腺癌细胞挖掘(博主亲自录制视频)
机器学习,统计分项目联系:QQ:231469242
数据
https://github.com/thomas-haslwanter/statsintro_python/blob/master/ISP/Code_Quantlets/14_Bayesian/bayesianStats/challenger_data.csv
https://github.com/thomas-haslwanter/statsintro_python/tree/master/ISP/Code_Quantlets/14_Bayesian/bayesianStats
因其右侧固体火箭助推器(SRB)的O型环密封圈失效,毗邻的外部燃料舱在泄漏出的火焰的高温烧灼下结构失效,使高速飞行中的航天飞机在空气阻力的作用下于发射后的第73秒解体 华氏31度等于摄氏-0.5555555555555556,因为温度过低造成挑战者号失事。 天气预报称28日的清晨将会非常寒冷,气温接近华氏31度(摄氏-0.5度),这是允许发射的最低温度。过低的温度让莫顿·塞奥科公司的工程师感到担心,该公司是制造与维护航天飞机SRB部件的承包商。在27日晚间的一次远程会议上,塞奥科公司的工程师和管理层同来自肯尼迪航天中心和马歇尔航天飞行中心的NASA管理层讨论了天气问题。部分工程师,如比较著名的罗杰·博伊斯乔利,再次表达了他们对密封SRB部件接缝处的O型环的担心:即,低温会导致O型环的橡胶材料失去弹性。他们认为,如果O型环的温度低于华氏53度(约摄氏11.7度),将无法保证它能有效密封住接缝。他们也提出,发射前一天夜间的低温,几乎肯定把SRB的温度降到华氏40度的警戒温度以下。但是,莫顿·塞奥科公司的管理层否决了他们的异议,他们认为发射进程能按日程进行。 由于低温,航天飞机旁矗立的定点通信建筑被大量冰雪覆盖。肯尼迪冰雪小组在红外摄像机中发现,右侧SRB部件尾部接缝处的温度仅有华氏8度(摄氏-13度):从液氧舱通风口吹来的极冷空气降低了接缝处的温度,让该处的温度远低于气温,并远低于O
# -*- coding: utf-8 -*- ''' 因其右侧固体火箭助推器(SRB)的O型环密封圈失效,毗邻的外部燃料舱在泄漏出的火焰的高温烧灼下结构失效,使高速飞行中的航天飞机在空气阻力的作用下于发射后的第73秒解体 华氏31度等于摄氏-0.5555555555555556,因为温度过低造成挑战者号失事。 天气预报称28日的清晨将会非常寒冷,气温接近华氏31度(摄氏-0.5度),这是允许发射的最低温度。过低的温度让莫顿·塞奥科公司的工程师感到担心,该公司是制造与维护航天飞机SRB部件的承包商。在27日晚间的一次远程会议上,塞奥科公司的工程师和管理层同来自肯尼迪航天中心和马歇尔航天飞行中心的NASA管理层讨论了天气问题。部分工程师,如比较著名的罗杰·博伊斯乔利,再次表达了他们对密封SRB部件接缝处的O型环的担心:即,低温会导致O型环的橡胶材料失去弹性。他们认为,如果O型环的温度低于华氏53度(约摄氏11.7度),将无法保证它能有效密封住接缝。他们也提出,发射前一天夜间的低温,几乎肯定把SRB的温度降到华氏40度的警戒温度以下。但是,莫顿·塞奥科公司的管理层否决了他们的异议,他们认为发射进程能按日程进行。 由于低温,航天飞机旁矗立的定点通信建筑被大量冰雪覆盖。肯尼迪冰雪小组在红外摄像机中发现,右侧SRB部件尾部接缝处的温度仅有华氏8度(摄氏-13度):从液氧舱通风口吹来的极冷空气降低了接缝处的温度,让该处的温度远低于气温,并远低于O形环的设计承限温度。但这个信息从未传达给决策层。 On the day of the Challenger disaster, the outside temperature was 31 ıF. The posterior distribution of a defect occurring, given this temperature, almost guaranteed that the Challenger was going to be subject to defective O-rings Example of PyMC - The Challenger Disaster This example uses Bayesian methods to find the mean and the 95% confidence intervals for the likelihood of an O-ring failure O型密封圈失效 in a space shuttle, as a function of the ambient temperature周围温度. Input data are the recorded O-ring performances of the space shuttles before 1986. ''' # Copyright(c) 2015, Thomas Haslwanter. All rights reserved, under the CC BY-SA 4.0 International License # Import standard packages import numpy as np import matplotlib.pyplot as plt from scipy import stats import pandas as pd import seaborn as sns import os # additional packages import pymc as pm from scipy.stats.mstats import mquantiles sns.set_context('poster') def logistic(x, beta, alpha=0): '''Logistic Function''' return 1.0 / (1.0 + np.exp(np.dot(beta, x) + alpha)) def getData(): '''Get and show the O-ring data''' inFile = 'challenger_data.csv' challenger_data = np.genfromtxt(inFile, skip_header=1, usecols=[1, 2], missing_values='NA', delimiter=',') # drop the NA values challenger_data = challenger_data[~np.isnan(challenger_data[:, 1])] temperature = challenger_data[:, 0] failureData = challenger_data[:, 1] # defect or not? return (temperature, failureData) def showAndSave(temperature, failures): '''Shows the input data, and saves the resulting figure''' # Plot it, as a function of tempature plt.figure() setFonts() sns.set_style('darkgrid') np.set_printoptions(precision=3, suppress=True) plt.scatter(temperature, failures, s=200, color="k", alpha=0.5) plt.yticks([0, 1]) plt.ylabel("Damage Incident?") plt.xlabel("Outside Temperature [F]") plt.title("Defects of the Space Shuttle O-Rings vs temperature") plt.tight_layout outFile = 'Challenger_ORings.png' showData(outFile) def mcmcSimulations(temperature, failures): '''Perform the MCMC-simulations''' # Define the prior distributions for alpha and beta # 'value' sets the start parameter for the simulation # The second parameter for the normal distributions is the "precision", # i.e. the inverse of the standard deviation np.random.seed(1234) beta = pm.Normal("beta", 0, 0.001, value=0) alpha = pm.Normal("alpha", 0, 0.001, value=0) # Define the model-function for the temperature @pm.deterministic def p(t=temperature, alpha=alpha, beta=beta): return 1.0 / (1. + np.exp(beta * t + alpha)) # connect the probabilities in `p` with our observations through a # Bernoulli random variable. observed = pm.Bernoulli("bernoulli_obs", p, value=failures, observed=True) # Combine the values to a model model = pm.Model([observed, beta, alpha]) # Perform the simulations map_ = pm.MAP(model) map_.fit() mcmc = pm.MCMC(model) mcmc.sample(120000, 100000, 2) # --- Show the resulting posterior distributions --- alpha_samples = mcmc.trace('alpha')[:, None] # best to make them 1d beta_samples = mcmc.trace('beta')[:, None] return(alpha_samples, beta_samples) def showSimResults(alpha_samples, beta_samples): '''Show the results of the simulations, and save them to an outFile''' plt.figure(figsize=(12.5, 6)) sns.set_style('darkgrid') setFonts(18) # Histogram of the samples: plt.subplot(211) plt.title(r"Posterior distributions of the variables $alpha, eta$") plt.hist(beta_samples, histtype='stepfilled', bins=35, alpha=0.85, label=r"posterior of $eta$", color="#7A68A6", normed=True) plt.legend() plt.subplot(212) plt.hist(alpha_samples, histtype='stepfilled', bins=35, alpha=0.85, label=r"posterior of $alpha$", color="#A60628", normed=True) plt.legend() outFile = 'Challenger_Parameters.png' showData(outFile) def calculateProbability(alpha_samples, beta_samples, temperature, failures): '''Calculate the mean probability, and the CIs''' # Calculate the probability as a function of time t = np.linspace(temperature.min() - 5, temperature.max() + 5, 50)[:, None] p_t = logistic(t.T, beta_samples, alpha_samples) mean_prob_t = p_t.mean(axis=0) # --- Calculate CIs --- # vectorized bottom and top 2.5% quantiles for "confidence interval" quantiles = mquantiles(p_t, [0.025, 0.975], axis=0) return (t, mean_prob_t, p_t, quantiles) def showProbabilities(linearTemperature, temperature, failures, mean_prob_t, p_t, quantiles): '''Show the posterior probabilities, and save the resulting figures''' # --- Show the probability curve ---- plt.figure(figsize=(12.5, 4)) setFonts(18) plt.plot(linearTemperature, mean_prob_t, lw=3, label="Average posterior probability of defect") plt.plot(linearTemperature, p_t[0, :], ls="--", label="Realization from posterior") plt.plot(linearTemperature, p_t[-2, :], ls="--", label="Realization from posterior") plt.scatter(temperature, failures, color="k", s=50, alpha=0.5) plt.title("Posterior expected value of probability of defect, plus realizations") plt.legend(loc="lower left") plt.ylim(-0.1, 1.1) plt.xlim(linearTemperature.min(), linearTemperature.max()) plt.ylabel("Probability") plt.xlabel("Temperature [F]") outFile = 'Challenger_Probability.png' showData(outFile) # --- Draw CIs --- setFonts() sns.set_style('darkgrid') plt.fill_between(linearTemperature[:, 0], *quantiles, alpha=0.7, color="#7A68A6") plt.plot(linearTemperature[:, 0], quantiles[0], label="95% CI", color="#7A68A6", alpha=0.7) plt.plot(linearTemperature, mean_prob_t, lw=1, ls="--", color="k", label="average posterior probability of defect") plt.xlim(linearTemperature.min(), linearTemperature.max()) plt.ylim(-0.02, 1.02) plt.legend(loc="lower left") plt.scatter(temperature, failures, color="k", s=50, alpha=0.5) plt.xlabel("Temperature [F]") plt.ylabel("Posterior Probability Estimate") outFile = 'Challenger_CIs.png' showData(outFile) if __name__=='__main__': (temperature, failures) = getData() showAndSave(temperature, failures) (alpha, beta) = mcmcSimulations(temperature, failures) showSimResults(alpha, beta) (linearTemperature, mean_p, p, quantiles) = calculateProbability(alpha, beta, temperature, failures) showProbabilities(linearTemperature, temperature, failures, mean_p, p, quantiles)
https://study.163.com/provider/400000000398149/index.htm?share=2&shareId=400000000398149( 欢迎关注博主主页,学习python视频资源,还有大量免费python经典文章)