Unsupervised Learning and NGS
This lab will walk you through some examples of unsupervised learning in NGS.
In [1]:
library(golubEsets)
data(Golub_Merge)
Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:parallel’:
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from ‘package:stats’:
IQR, mad, xtabs
The following objects are masked from ‘package:base’:
anyDuplicated, append, as.data.frame, as.vector, cbind, colnames,
do.call, duplicated, eval, evalq, Filter, Find, get, grep, grepl,
intersect, is.unsorted, lapply, lengths, Map, mapply, match, mget,
order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
union, unique, unlist, unsplit
Welcome to Bioconductor
Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.
MDS/PCA
We have loaded the Golub_Merge data set that was used in this morning’s lecture. It is stored as an ExpressionSet, a fairly complex data type used by Bioconductor.
In [2]:
class(Golub_Merge)
head(Golub_Merge)
summary(Golub_Merge)
Out[2]:
'ExpressionSet'
Out[2]:
ExpressionSet (storageMode: lockedEnvironment)
assayData: 1 features, 72 samples
element names: exprs
protocolData: none
phenoData
sampleNames: 39 40 ... 33 (72 total)
varLabels: Samples ALL.AML ... Source (11 total)
varMetadata: labelDescription
featureData: none
experimentData: use 'experimentData(object)'
pubMedIds: 10521349
Annotation: hu6800
Out[2]:
Length Class Mode
1 ExpressionSet S4
In [3]:
# Fortunately, Bioconductor has a function to extract the expression data:
Golub.expression.data=exprs(Golub_Merge)
head(Golub.expression.data)
dim(Golub.expression.data)
Out[3]:
| | 39 | 40 | 42 | 47 | 48 | 49 | 41 | 43 | 44 | 45 | ⋯ | 35 | 36 | 37 | 38 | 28 | 29 | 30 | 31 | 32 | 33 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
AFFX-BioB-5_at | -342 | -87 | 22 | -243 | -130 | -256 | -62 | 86 | -146 | -187 | ⋯ | 7 | -213 | -25 | -72 | -4 | 15 | -318 | -32 | -124 | -135 |
AFFX-BioB-M_at | -200 | -248 | -153 | -218 | -177 | -249 | -23 | -36 | -74 | -187 | ⋯ | -100 | -252 | -20 | -139 | -116 | -114 | -192 | -49 | -79 | -186 |
AFFX-BioB-3_at | 41 | 262 | 17 | -163 | -28 | -410 | -7 | -141 | 170 | 312 | ⋯ | -57 | 136 | 124 | -1 | -125 | 2 | -95 | 49 | -37 | -70 |
AFFX-BioC-5_at | 328 | 295 | 276 | 182 | 266 | 24 | 142 | 252 | 174 | 142 | ⋯ | 132 | 318 | 325 | 392 | 241 | 193 | 312 | 230 | 330 | 337 |
AFFX-BioC-3_at | -224 | -226 | -211 | -289 | -170 | -535 | -233 | -201 | -32 | 114 | ⋯ | -377 | -209 | -396 | -324 | -191 | -51 | -139 | -367 | -188 | -407 |
AFFX-BioDn-5_at | -427 | -493 | -250 | -268 | -326 | -810 | -284 | -384 | -318 | -148 | ⋯ | -478 | -557 | -464 | -510 | -411 | -155 | -344 | -508 | -423 | -566 |
Out[3]:
- 7129
- 72
In [4]:
# Now, we compute the distance matrix:
Golub.distance.matrix=dist(t(Golub.expression.data))
# We use t() to transpose the matrix because dist() computes distances between rows,
# and we want distances between samples, which are stored in columns.
as.matrix(Golub.distance.matrix)
dim(as.matrix(Golub.distance.matrix))
Out[4]:
| | 39 | 40 | 42 | 47 | 48 | 49 | 41 | 43 | 44 | 45 | ⋯ | 35 | 36 | 37 | 38 | 28 | 29 | 30 | 31 | 32 | 33 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
39 | 0 | 101530.75134165 | 94405.0427042963 | 115681.471662492 | 116666.762777579 | 93189.9424777159 | 101409.857528743 | 117607.803048947 | 86952.1135453302 | 97854.9785856601 | ⋯ | 117779.650470699 | 136031.239191592 | 128620.550613034 | 119218.084769048 | 109230.616353658 | 118539.701636203 | 129537.9602356 | 105918.973493893 | 108391.366247501 | 121993.463816714 |
40 | 101530.75134165 | 0 | 89502.2884288441 | 98459.6550065051 | 92980.4060810664 | 79182.927964303 | 81381.9940773142 | 109351.494530253 | 87252.808401793 | 88662.6601901838 | ⋯ | 91413.8671592007 | 129683.258726792 | 114292.294867152 | 107131.461336061 | 90504.1050063476 | 90631.1349923414 | 109795.078560016 | 84080.5244274796 | 88449.7392703902 | 100042.754050456 |
42 | 94405.0427042963 | 89502.2884288441 | 0 | 106954.738258761 | 88800.7424462206 | 91730.070876458 | 91400.2229975398 | 102774.803011244 | 90296.8667451978 | 90580.0811105841 | ⋯ | 113191.21683682 | 120374.432841862 | 97251.0110692943 | 88294.7444189064 | 85095.2870316565 | 90383.4030339641 | 106316.88261043 | 96237.0595560775 | 94266.5720974302 | 95023.7579976713 |
47 | 115681.471662492 | 98459.6550065051 | 106954.738258761 | 0 | 115169.014396234 | 109182.775431842 | 93290.2490617321 | 105747.545446691 | 97237.7721258565 | 97784.6800270881 | ⋯ | 76272.8337286612 | 138114.10801942 | 125885.089331501 | 121789.679324646 | 110847.203270989 | 98573.1509286378 | 130349.399553661 | 103235.654112327 | 111531.040078536 | 127448.219006779 |
48 | 116666.762777579 | 92980.4060810664 | 88800.7424462206 | 115169.014396234 | 0 | 96797.5663020512 | 92101.8278265964 | 124344.701563838 | 98684.8840248597 | 88248.9366564833 | ⋯ | 124334.479690873 | 129861.00468963 | 112328.335663803 | 102728.6884322 | 82271.0042967752 | 99771.3776841835 | 99908.6183619812 | 94718.8465776479 | 90727.6477155668 | 104320.326796842 |
49 | 93189.9424777159 | 79182.927964303 | 91730.070876458 | 109182.775431842 | 96797.5663020512 | 0 | 83416.5873852437 | 115825.703567904 | 89241.8910545939 | 92708.7901819455 | ⋯ | 104864.125047606 | 125927.362034627 | 112632.708113585 | 105694.076082816 | 90800.8308772558 | 103478.898269164 | 113053.022476182 | 83822.0093352575 | 85829.852883481 | 100606.532277979 |
41 | 101409.857528743 | 81381.9940773142 | 91400.2229975398 | 93290.2490617321 | 92101.8278265964 | 83416.5873852437 | 0 | 96494.1568904563 | 76495.7141805997 | 77813.4478608936 | ⋯ | 97652.2335689256 | 133363.211951422 | 111403.195932612 | 99165.2819186231 | 90331.993064473 | 77039.4635235734 | 112335.677760897 | 79434.8565933117 | 91030.549064586 | 110556.010537646 |
43 | 117607.803048947 | 109351.494530253 | 102774.803011244 | 105747.545446691 | 124344.701563838 | 115825.703567904 | 96494.1568904563 | 0 | 107846.317442924 | 114440.595900231 | ⋯ | 110525.995272605 | 144945.732224167 | 130253.294829728 | 117571.748715412 | 122765.185867167 | 104096.191227153 | 139022.393634982 | 115532.774510093 | 119322.482630056 | 120064.588101571 |
44 | 86952.1135453302 | 87252.808401793 | 90296.8667451978 | 97237.7721258565 | 98684.8840248597 | 89241.8910545939 | 76495.7141805997 | 107846.317442924 | 0 | 73557.527405426 | ⋯ | 101880.854045301 | 130697.111636027 | 119685.714703134 | 112801.238769794 | 99492.9307790257 | 94102.1585831058 | 119220.509603004 | 87250.9331984478 | 100756.936431196 | 117482.216407421 |
45 | 97854.9785856601 | 88662.6601901838 | 90580.0811105841 | 97784.6800270881 | 88248.9366564833 | 92708.7901819455 | 77813.4478608936 | 114440.595900231 | 73557.527405426 | 0 | ⋯ | 103923.933658229 | 131771.386780287 | 112896.807040766 | 102245.018431217 | 82643.2769497919 | 82259.8661802947 | 106670.772388691 | 89961.29273193 | 91779.1921025676 | 109075.024666511 |
46 | 92051.8848367593 | 89667.3888322839 | 90853.5671066359 | 93137.5871708088 | 103778.045775588 | 97774.9851393494 | 86164.8348225655 | 108309.93454896 | 60701.5280697282 | 67341.8403668923 | ⋯ | 91772.1338370205 | 131851.136240838 | 122629.851618601 | 113902.082184655 | 98956.6898142819 | 95964.4094443351 | 119740.404609305 | 94535.5741929989 | 100760.758279203 | 117334.827498062 |
70 | 94474.1292471119 | 90543.0415051317 | 98962.5288530967 | 84884.5253329486 | 106380.763420837 | 82446.6710061722 | 71787.8693721997 | 102140.092152886 | 75369.6851990242 | 73550.3206043318 | ⋯ | 78297.3535184938 | 131345.050675692 | 116642.178713363 | 105495.372443534 | 93138.8902124134 | 88338.7688447151 | 117137.052622131 | 81481.3470004025 | 89789.6259486585 | 115285.709014604 |
71 | 91210.449231434 | 75800.9960026912 | 76012.2278255282 | 102929.632880915 | 85124.8725579075 | 87899.0488458209 | 78620.6871949107 | 111293.897555077 | 84107.7911194914 | 77705.6944052365 | ⋯ | 104607.345956199 | 121683.652973602 | 102919.912023864 | 92218.7278756327 | 74663.2411833293 | 83278.0234095406 | 101235.998088625 | 82823.860807137 | 78003.7865747555 | 94976.6624966365 |
72 | 85805.5319370494 | 75909.4173538435 | 72368.3974328574 | 85522.1064462283 | 78803.3894068015 | 83746.5995846996 | 73045.5381320447 | 106398.918016115 | 75438.3717016745 | 75249.0893034593 | ⋯ | 87490.7528828047 | 118659.364244884 | 104653.921957087 | 94655.3118266482 | 75133.7961905293 | 83681.2539401747 | 99689.5591975408 | 81692.1548558979 | 77858.9929937448 | 98954.6254654122 |
68 | 113371.006527242 | 94525.0913091334 | 86831.7762745874 | 106406.654307896 | 72720.88323171 | 93627.3702610513 | 79041.1383837556 | 118776.311278807 | 92909.6209926615 | 81663.7217924826 | ⋯ | 114034.96953128 | 116602.289973225 | 106996.968400979 | 105919.584190083 | 78062.971785604 | 91675.3071497445 | 88044.3968290998 | 82559.1503166063 | 83379.4356541228 | 101970.751968395 |
69 | 114328.27075575 | 93648.6553347137 | 81837.8932280151 | 116631.871390285 | 60917.9287976865 | 95780.4013825375 | 86820.861133716 | 120450.05646325 | 95714.8069005 | 82387.9940646694 | ⋯ | 124820.357854799 | 120591.500098473 | 106033.822424734 | 100949.554778612 | 75680.0174616787 | 94248.5113728594 | 93849.6951406876 | 89553.6454087716 | 83392.7434432997 | 95354.1322125056 |
67 | 104923.776261627 | 95253.6643337147 | 85413.6313418415 | 97688.8063341957 | 101947.613449261 | 95155.9652885724 | 90714.1880468541 | 113487.211204611 | 94864.5158476024 | 95265.8546017407 | ⋯ | 103465.554616017 | 114463.166228268 | 108825.966519944 | 103737.016464712 | 87475.6447589842 | 95459.0105490309 | 103849.957818961 | 88200.1779419974 | 83420.1260368264 | 98790.3822191209 |
55 | 111413.432713475 | 97525.7608275885 | 106238.206983175 | 86992.5883107291 | 120793.825214702 | 99316.8364880799 | 70971.4240169943 | 86383.7981510422 | 90214.1008102392 | 95888.6012255888 | ⋯ | 82203.8903094008 | 136278.633640788 | 126178.725120363 | 114127.608566026 | 109515.782844301 | 90223.3165151891 | 127565.108352559 | 89019.2765472737 | 108157.201692721 | 120731.36322845 |
56 | 108328.442414723 | 98873.494466414 | 113378.453199892 | 83151.6434594049 | 125261.308076357 | 105717.137007204 | 93574.8552336577 | 94646.2567775398 | 94726.3140209731 | 101458.691130923 | ⋯ | 68800.5136972101 | 146430.36153749 | 139515.331716625 | 129760.954235086 | 118854.082416213 | 107286.030488596 | 139303.208017619 | 107856.229500201 | 115581.606456218 | 130999.417517789 |
59 | 131576.809753847 | 129513.068541364 | 132666.329650744 | 110509.423521255 | 147825.245915574 | 138870.961874684 | 131165.931007255 | 111627.811986978 | 117380.340002063 | 129771.774007293 | ⋯ | 111820.59058152 | 156134.184171821 | 161512.470227534 | 161294.097706643 | 147646.977646005 | 138725.27630897 | 157790.703880172 | 137376.70195852 | 143520.3871929 | 154930.491379199 |
52 | 146292.3142137 | 114630.412605033 | 124510.074323325 | 137416.976200177 | 118992.801652033 | 122808.208646654 | 108944.69472168 | 137810.530620849 | 127975.314986915 | 118160.595157607 | ⋯ | 137413.985951212 | 122719.62710178 | 111474.381200346 | 120673.798291095 | 115007.101993746 | 101321.694123223 | 102290.66766817 | 103982.128281739 | 109921.549356803 | 114675.426835918 |
53 | 125260.795750306 | 107918.301617474 | 104758.98736624 | 123460.821874796 | 110098.627543671 | 105379.23237052 | 110035.813915289 | 133547.878770125 | 119231.604300202 | 113690.045263427 | ⋯ | 119931.670942249 | 102833.580867341 | 84030.4486242933 | 101116.091914195 | 87274.2211136828 | 95270.3538777935 | 83002.4742884211 | 91028.0997604586 | 89059.1689777083 | 90762.7430061476 |
51 | 119563.021532579 | 100986.85828364 | 108365.665918685 | 92753.5421210425 | 122200.837415298 | 106407.598962668 | 96716.9950318971 | 113059.110990667 | 106451.981954307 | 100396.417316556 | ⋯ | 79249.2852144421 | 116907.729616138 | 100996.686668425 | 98948.6528154881 | 90167.9525164013 | 77172.8046658925 | 107972.897784583 | 82243.3193396278 | 91287.3718046478 | 98840.1926647252 |
50 | 111571.333598734 | 91490.7270109928 | 96750.9254167628 | 117469.567373852 | 96909.7695281544 | 97395.1085424725 | 95800.3486580294 | 128645.179128485 | 98890.5531231371 | 91703.2451715859 | ⋯ | 111109.114216611 | 92347.2801494446 | 78498.2996185268 | 90208.9753572226 | 63298.6299377799 | 83625.7206187187 | 70314.7841566765 | 65831.9121779096 | 63570.7325268476 | 77662.2533216749 |
54 | 121443.978562134 | 105489.585670814 | 110185.626381121 | 106679.414766861 | 115078.243538907 | 104694.663340592 | 89456.525066649 | 115485.592629557 | 104769.525803069 | 109149.760984621 | ⋯ | 119726.623246461 | 122526.566307067 | 110404.763099243 | 117380.510405263 | 108548.030557906 | 94719.3592619798 | 111749.681122588 | 86262.359178265 | 106376.698454126 | 115328.910482151 |
57 | 123630.179992589 | 103912.49746301 | 112490.833239869 | 109090.610086295 | 127041.121067157 | 109797.540855886 | 84165.9510788062 | 112265.522975667 | 107564.128188723 | 104052.858274052 | ⋯ | 110889.012981449 | 137595.572657699 | 118254.61040061 | 109267.92150032 | 112096.122172 | 78006.4429582583 | 125060.530900041 | 93531.2583845636 | 113456.948557592 | 118941.324126647 |
58 | 130139.752835173 | 136763.453846413 | 138157.175188262 | 113752.452439497 | 161925.149488892 | 137183.42650991 | 141270.489749275 | 128731.630052602 | 125774.597641972 | 140812.557280947 | ⋯ | 108790.033969109 | 141391.508889325 | 155024.720170688 | 160183.709564987 | 147040.885460473 | 144644.655303955 | 150611.714411595 | 126887.142382512 | 136219.897214027 | 145257.842769332 |
60 | 147424.874142731 | 133248.156647662 | 138021.465982651 | 121755.067820604 | 148109.369490927 | 136291.791620039 | 122187.428682332 | 111913.554657155 | 127028.786241544 | 132320.025151146 | ⋯ | 123754.202130675 | 151227.879744444 | 144742.549936085 | 144116.166692013 | 139599.843269969 | 117028.569187186 | 144057.702820779 | 119207.707301164 | 135742.819990598 | 140293.53858963 |
61 | 101303.33252169 | 99757.2813182075 | 102612.766681344 | 99438.4519388752 | 119574.990098264 | 96988.6012632412 | 99894.5837971209 | 119739.586186023 | 101478.537622494 | 102277.115338672 | ⋯ | 87634.902413365 | 122285.724841455 | 118976.683270295 | 113977.460175247 | 99096.3693633627 | 107588.439109414 | 111973.781645526 | 86358.4214191065 | 94630.516626509 | 108523.205527666 |
65 | 102585.496440774 | 83294.3259772237 | 89829.0354840794 | 108657.019221033 | 96223.0160356658 | 86370.0686291264 | 74878.0787947981 | 112490.577143155 | 92767.6807622137 | 81078.7661351602 | ⋯ | 107010.646764703 | 109499.227549787 | 92998.2419511251 | 88360.1463104266 | 73962.7612180616 | 72852.7950596269 | 91265.4742167048 | 71065.6733592246 | 78902.049067182 | 88887.5228926985 |
| | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋱ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
9 | 139013.57634778 | 123645.035472517 | 118097.479020511 | 143624.608841939 | 98682.762142129 | 112345.758531419 | 108407.497651223 | 150710.661105975 | 123176.488247555 | 120222.489747135 | ⋯ | 146699.292227331 | 136910.169509062 | 119976.1176193 | 113378.212611595 | 99862.6473712769 | 114983.06039152 | 112489.181057558 | 103231.297439294 | 102728.343002309 | 117727.734425665 |
10 | 114732.524590894 | 102559.629791649 | 98142.2834816879 | 118659.657327164 | 97300.3468236368 | 96826.1225496508 | 82671.5580898292 | 123845.505671381 | 99354.7592267225 | 96318.7495194991 | ⋯ | 119183.931081333 | 129445.675319804 | 110752.576344751 | 95866.2086086646 | 89345.5496933115 | 92008.3159774159 | 106150.652767658 | 84555.5449571464 | 93449.8417334133 | 107902.519307938 |
11 | 121155.117721044 | 98899.7280178262 | 97775.9916646208 | 113198.678746706 | 91682.7385880243 | 95858.3700727276 | 82409.3088916537 | 125961.360710338 | 99511.6022280819 | 95037.5594067945 | ⋯ | 108658.080468044 | 131003.997091692 | 109379.759718149 | 91299.4691988951 | 83791.1419960368 | 86854.1962947099 | 105866.321112996 | 82347.4710844237 | 87292.0091703702 | 105676.950571068 |
12 | 96022.3048515292 | 72476.2884466361 | 83746.6804058525 | 97316.5916121193 | 95422.6260799817 | 82230.9479454056 | 67067.0148359087 | 103177.68292126 | 76610.1441259576 | 70512.4869863487 | ⋯ | 93769.594688257 | 120983.803374667 | 99564.7586397918 | 91220.0626013817 | 80721.8185127169 | 72340.6742089124 | 104126.751649132 | 70954.4871237894 | 83955.9685192185 | 95691.0556374001 |
13 | 95335.6838702067 | 76779.5337378393 | 72302.1535433074 | 98117.3633308601 | 54920.1584848405 | 85930.6319539197 | 79471.3575258407 | 113908.067383307 | 79588.3275172936 | 79768.78317362 | ⋯ | 109297.781477942 | 120437.389439493 | 110355.858698123 | 106036.111749724 | 80702.6131671088 | 94790.9757149909 | 99387.802445773 | 86223.0747827981 | 86036.6400320236 | 101502.044605023 |
14 | 122539.062857523 | 86400.319108207 | 96436.0401613422 | 104820.952848178 | 85607.4026530416 | 89686.83212713 | 84886.1335672676 | 121679.451732821 | 98642.5148402047 | 90796.5119043678 | ⋯ | 102658.203992667 | 125305.50865784 | 106099.912902886 | 96654.025875801 | 79983.9575477483 | 88583.7985638457 | 101097.778343542 | 81528.4719469217 | 78067.5907211181 | 99656.2550169331 |
15 | 112437.512934963 | 91468.7203419836 | 93265.2567733559 | 119314.073792659 | 62914.0169914463 | 88941.4769047602 | 74370.1613350408 | 125023.732287114 | 90151.4587291853 | 88724.6068630343 | ⋯ | 127660.461647293 | 126591.578546916 | 105914.150711791 | 104033.874199705 | 81236.6600495121 | 93884.1422445772 | 99821.0206770097 | 87236.9430402052 | 84792.7639129661 | 106186.268241237 |
16 | 104436.345895478 | 78686.0457387458 | 89565.0864008962 | 97353.7868600909 | 72216.5637579081 | 74486.5197401516 | 62357.9159048793 | 110010.660101647 | 85278.2539338136 | 80303.699827592 | ⋯ | 100019.98634773 | 129990.448695279 | 107276.627734097 | 94293.0507619729 | 80907.822236172 | 82033.1470687307 | 102852.531111295 | 77027.5740627991 | 78746.5958692819 | 102714.005335202 |
17 | 133954.967388298 | 110465.27246153 | 109687.208762007 | 122182.118495302 | 89196.5011141132 | 105574.175417097 | 110887.678531025 | 141384.719634054 | 119712.104517463 | 111443.533755889 | ⋯ | 126160.479814401 | 127940.828072981 | 119079.098056712 | 114987.655328735 | 91755.0827093519 | 114039.075382081 | 107114.109556118 | 101300.275843652 | 90414.9356301269 | 110954.542971435 |
18 | 101119.276485742 | 82748.2651238079 | 92728.4260353857 | 105687.623622636 | 98133.8673241812 | 89509.9928723045 | 76131.1428720205 | 117148.768175342 | 79813.6033706034 | 87630.3912863568 | ⋯ | 115164.659483715 | 124361.384629635 | 103462.47635254 | 103085.235698426 | 89821.7126757222 | 87948.3983992887 | 108145.659159302 | 85409.1055625804 | 95232.4906006348 | 105375.902012747 |
19 | 100086.195321832 | 64888.8592286842 | 79378.5656962886 | 88448.291922456 | 76186.5834186046 | 80035.8509107013 | 66720.9469507141 | 100193.300165231 | 82723.1682662119 | 79327.0694277811 | ⋯ | 97163.7244603149 | 131099.77723856 | 103331.218564382 | 91342.0503382752 | 79793.841786193 | 79775.7938725777 | 106041.239855068 | 82937.3448333138 | 84740.384062146 | 98734.3685349737 |
20 | 151359.742871082 | 136915.485639865 | 126484.84321056 | 153195.375041155 | 86549.3382470369 | 132540.436663684 | 127927.577019187 | 159623.857627862 | 140017.593751643 | 131565.102683044 | ⋯ | 164602.736310792 | 156631.734192021 | 137475.997774157 | 132309.087730208 | 114711.362606326 | 139102.853967846 | 128396.675837032 | 133710.654377278 | 123043.548262394 | 133408.831326865 |
21 | 148725.958558014 | 118197.2836363 | 129459.554297858 | 131772.938735539 | 112888.674356642 | 123780.816663973 | 116439.828027183 | 146233.070462874 | 137791.601663527 | 126421.048805173 | ⋯ | 131572.433883394 | 170422.750860911 | 146724.83903893 | 130413.308343129 | 122947.558194541 | 119490.790343859 | 143446.055445941 | 133460.273328807 | 131148.811553899 | 140269.916543071 |
22 | 97039.176408294 | 85374.0308524788 | 88230.0504136771 | 97569.3346190287 | 100402.491816687 | 88284.9959109701 | 86621.9132783385 | 116733.544870358 | 82845.4784402866 | 84413.7411444369 | ⋯ | 92753.000032344 | 131082.47946236 | 113553.807122439 | 97738.7238815813 | 87607.118757553 | 89204.6454059429 | 114030.68416878 | 89525.765107035 | 94743.839604483 | 107061.308818826 |
23 | 112395.281942793 | 102193.264499183 | 110792.523994176 | 115054.574906868 | 118659.453732098 | 92951.2384855629 | 82646.3679601227 | 116416.812071968 | 100640.332307679 | 109051.501580675 | ⋯ | 104993.622311072 | 145138.914130567 | 125928.893495496 | 107770.582929666 | 109765.968988571 | 100340.875544316 | 132280.916548836 | 95238.0399367816 | 105492.14587826 | 124174.843881521 |
24 | 104472.194908502 | 75163.9538010076 | 87618.1065590897 | 105327.800304573 | 67916.8077577267 | 78223.0685284079 | 61829.0135858563 | 114773.64850435 | 91263.971450951 | 83036.099059385 | ⋯ | 106372.340568402 | 125763.465986748 | 104563.058223256 | 92216.6611139224 | 73601.0057268241 | 85734.674997926 | 96824.959845073 | 78036.8831707161 | 75527.4119641339 | 97411.3731398957 |
25 | 100455.754414568 | 84006.0226114771 | 80387.5345249498 | 106164.405536884 | 84161.4071412782 | 86496.6540971384 | 87279.4096737598 | 111757.891287372 | 92620.4781730261 | 68795.894753975 | ⋯ | 111773.925921925 | 122262.464448415 | 96529.4272799751 | 83103.3138629261 | 70099.5102835961 | 84342.7388990896 | 101158.01370628 | 89012.678439647 | 83513.5627069041 | 88594.94136236 |
26 | 96423.8025022868 | 76825.3657654814 | 82523.9685727729 | 94425.3390727298 | 74122.927357195 | 77243.8683650683 | 74618.7847059975 | 113683.022967372 | 78641.4727290887 | 68664.1769265459 | ⋯ | 98846.3521734616 | 124776.205227599 | 111162.027226027 | 102549.710555418 | 77292.0036614397 | 88215.8576050814 | 102767.340167001 | 87414.6945598965 | 81665.2573987249 | 103419.008354364 |
27 | 100889.019699866 | 75352.3585828606 | 94413.5022758927 | 96053.8511461149 | 98867.6573759083 | 62402.6812324599 | 73547.6111508729 | 108581.91361364 | 90374.5190858574 | 87919.1821049309 | ⋯ | 83929.1832618428 | 135903.819850658 | 114696.633403078 | 97311.8318088813 | 90015.1743152231 | 88584.5657662778 | 116250.633847734 | 82779.2433282644 | 89867.6872685617 | 104652.760364933 |
34 | 116180.332307151 | 87656.8927751834 | 96869.9419995697 | 98685.6529643494 | 93928.6206861359 | 90219.1201353682 | 84797.7828542704 | 122562.601804139 | 104474.790916278 | 88797.9201614542 | ⋯ | 96047.1122314461 | 109823.537235877 | 85722.9568260451 | 86804.7804501572 | 68493.0690508171 | 77836.1541894254 | 83969.2815915439 | 72144.2504361921 | 72573.3098321966 | 88704.5455486921 |
35 | 117779.650470699 | 91413.8671592007 | 113191.21683682 | 76272.8337286612 | 124334.479690873 | 104864.125047606 | 97652.2335689256 | 110525.995272605 | 101880.854045301 | 103923.933658229 | ⋯ | 0 | 142114.412544963 | 131936.891167709 | 118725.963179921 | 110432.546941561 | 100834.189494437 | 130860.754365088 | 96707.7814500984 | 107177.240625051 | 123061.562280836 |
36 | 136031.239191592 | 129683.258726792 | 120374.432841862 | 138114.10801942 | 129861.00468963 | 125927.362034627 | 133363.211951422 | 144945.732224167 | 130697.111636027 | 131771.386780287 | ⋯ | 142114.412544963 | 0 | 90073.2351256465 | 129402.191140645 | 104821.708185852 | 125556.409147443 | 86763.6260998813 | 105636.157985796 | 102278.617789839 | 92275.5335503404 |
37 | 128620.550613034 | 114292.294867152 | 97251.0110692943 | 125885.089331501 | 112328.335663803 | 112632.708113585 | 111403.195932612 | 130253.294829728 | 119685.714703134 | 112896.807040766 | ⋯ | 131936.891167709 | 90073.2351256465 | 0 | 86477.7952656056 | 82499.9479818018 | 93191.6235076952 | 85492.3813564694 | 94378.0107864115 | 90187.4135841582 | 76670.6436127935 |
38 | 119218.084769048 | 107131.461336061 | 88294.7444189064 | 121789.679324646 | 102728.6884322 | 105694.076082816 | 99165.2819186231 | 117571.748715412 | 112801.238769794 | 102245.018431217 | ⋯ | 118725.963179921 | 129402.191140645 | 86477.7952656056 | 0 | 79367.705264799 | 81027.8421655174 | 107960.405038143 | 93848.7314032534 | 93607.4671754342 | 87276.1244212872 |
28 | 109230.616353658 | 90504.1050063476 | 85095.2870316565 | 110847.203270989 | 82271.0042967752 | 90800.8308772558 | 90331.993064473 | 122765.185867167 | 99492.9307790257 | 82643.2769497919 | ⋯ | 110432.546941561 | 104821.708185852 | 82499.9479818018 | 79367.705264799 | 0 | 82274.160700672 | 84411.8675009622 | 77753.7877341033 | 62455.3597539875 | 70333.3497496031 |
29 | 118539.701636203 | 90631.1349923414 | 90383.4030339641 | 98573.1509286378 | 99771.3776841835 | 103478.898269164 | 77039.4635235734 | 104096.191227153 | 94102.1585831058 | 82259.8661802947 | ⋯ | 100834.189494437 | 125556.409147443 | 93191.6235076952 | 81027.8421655174 | 82274.160700672 | 0 | 98261.8517431867 | 80665.2564738996 | 90737.2606650653 | 97952.7865453556 |
30 | 129537.9602356 | 109795.078560016 | 106316.88261043 | 130349.399553661 | 99908.6183619812 | 113053.022476182 | 112335.677760897 | 139022.393634982 | 119220.509603004 | 106670.772388691 | ⋯ | 130860.754365088 | 86763.6260998813 | 85492.3813564694 | 107960.405038143 | 84411.8675009622 | 98261.8517431867 | 0 | 85488.2100175223 | 85221.709118041 | 89968.411289741 |
31 | 105918.973493893 | 84080.5244274796 | 96237.0595560775 | 103235.654112327 | 94718.8465776479 | 83822.0093352575 | 79434.8565933117 | 115532.774510093 | 87250.9331984478 | 89961.29273193 | ⋯ | 96707.7814500984 | 105636.157985796 | 94378.0107864115 | 93848.7314032534 | 77753.7877341033 | 80665.2564738996 | 85488.2100175223 | 0 | 67921.844520007 | 89268.4855030038 |
32 | 108391.366247501 | 88449.7392703902 | 94266.5720974302 | 111531.040078536 | 90727.6477155668 | 85829.852883481 | 91030.549064586 | 119322.482630056 | 100756.936431196 | 91779.1921025676 | ⋯ | 107177.240625051 | 102278.617789839 | 90187.4135841582 | 93607.4671754342 | 62455.3597539875 | 90737.2606650653 | 85221.709118041 | 67921.844520007 | 0 | 78707.6214289315 |
33 | 121993.463816714 | 100042.754050456 | 95023.7579976713 | 127448.219006779 | 104320.326796842 | 100606.532277979 | 110556.010537646 | 120064.588101571 | 117482.216407421 | 109075.024666511 | ⋯ | 123061.562280836 | 92275.5335503404 | 76670.6436127935 | 87276.1244212872 | 70333.3497496031 | 97952.7865453556 | 89968.411289741 | 89268.4855030038 | 78707.6214289315 | 0 |
Out[4]:
- 72
- 72
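To see why the transpose matters, here is a minimal sketch on a toy matrix. The matrix `toy` and its gene and sample labels are made up for illustration; only `dist` and `t` come from the lab. Because `dist` measures distances between rows, samples stored in columns must be moved to rows first.

```r
# Toy expression matrix: 3 genes (rows) x 2 samples (columns); values are made up.
toy <- matrix(c(0, 0, 0,
                3, 4, 0),
              nrow = 3,
              dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))

# dist() works on rows, so without t() we get gene-to-gene distances (3 x 3).
gene.dist <- as.matrix(dist(toy))

# With t() we get sample-to-sample distances (2 x 2), which is what MDS needs.
sample.dist <- as.matrix(dist(t(toy)))

dim(gene.dist)
dim(sample.dist)
sample.dist["s1", "s2"]  # Euclidean distance: sqrt(3^2 + 4^2 + 0^2)
```

The same logic explains the 72 x 72 result above: 7129 genes are collapsed into one distance per pair of patients.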
In [5]:
# cmdscale takes the distance matrix and produces a 72x2 matrix of coordinates,
# chosen so that the distances between the 72 subjects are approximated as closely as possible in 2 dimensions.
MDS=cmdscale(Golub.distance.matrix)
as.matrix(dist(MDS))[,1:3]
Out[5]:
| | 39 | 40 | 42 |
---|---|---|---|
39 | 0 | 16038.2434439186 | 29333.9767138976 |
40 | 16038.2434439186 | 0 | 13605.9705171535 |
42 | 29333.9767138976 | 13605.9705171535 | 0 |
47 | 23552.9423836755 | 39417.4121693621 | 52201.8324270926 |
48 | 64453.9849933956 | 49578.2133702788 | 40471.7633405931 |
49 | 23270.2385111487 | 7549.20365531733 | 8813.71057559277 |
41 | 18897.4685257976 | 12677.0867586647 | 22203.3989837184 |
43 | 24877.9275196879 | 39980.5345240873 | 52022.3289901934 |
44 | 11608.8365898732 | 19495.1874893196 | 32620.364496308 |
45 | 21986.180776137 | 12334.7689293691 | 19276.3804044188 |
46 | 12353.1974412281 | 27480.5752421663 | 41065.2781010112 |
70 | 11567.2338485458 | 25791.26659 | 39396.0382737991 |
71 | 32141.6244376901 | 16404.8423103425 | 8766.06274749528 |
72 | 21340.2131000514 | 8112.77062347784 | 14064.7548887292 |
68 | 46960.488117714 | 30967.8411939793 | 19043.3116656109 |
69 | 60749.4126702763 | 44960.9774229933 | 33626.7933561275 |
67 | 23261.0401743963 | 13495.3406318589 | 13501.6078622663 |
55 | 28240.2538157187 | 43921.7962040931 | 56440.5931774652 |
56 | 43148.7246669143 | 59156.8970358259 | 72449.6131464191 |
59 | 60222.2613998395 | 75936.3966078412 | 88284.3689607091 |
52 | 68650.7572883528 | 56900.8126502351 | 45852.5701350388 |
53 | 66068.7537285338 | 55304.8679704626 | 45202.5765837667 |
51 | 34445.4423128612 | 39368.2090843029 | 44407.9441518336 |
50 | 61453.661678579 | 50071.0256127093 | 39578.0327057698 |
54 | 38451.9163659507 | 34484.9386406519 | 32855.4111082329 |
57 | 22336.9612840752 | 26702.3531153925 | 33277.0791621479 |
58 | 78923.9514118395 | 90817.4184905822 | 99201.3293937777 |
60 | 47008.8860969912 | 58032.4652076926 | 66401.4950917156 |
61 | 31936.1618089788 | 38859.938899306 | 45389.6195458329 |
65 | 41662.856109954 | 30493.639778679 | 21729.9704590229 |
| | ⋮ | ⋮ | ⋮ |
9 | 84087.7696390347 | 68371.5945561943 | 56914.274729172 |
10 | 43710.1617488486 | 28192.5929461727 | 18740.4977464523 |
11 | 47169.6007661671 | 32728.4667497134 | 25695.9432660911 |
12 | 15850.4255260353 | 2929.89908810448 | 13527.1845868817 |
13 | 38671.0422776345 | 23960.2909171867 | 17704.451396332 |
14 | 42313.08972269 | 26675.2000847951 | 16940.5991114831 |
15 | 63945.9652041395 | 48678.7726781357 | 38722.0035208157 |
16 | 39902.3225899748 | 26454.0318696019 | 22482.6239271085 |
17 | 60941.4322628754 | 44903.2095498701 | 32095.0352657368 |
18 | 28732.0890684024 | 12930.6532369879 | 806.15210698752 |
19 | 27388.7200989158 | 14671.5210875502 | 16341.8068731787 |
20 | 98883.0366662971 | 83615.3417773099 | 73062.1020631234 |
21 | 52623.723702172 | 41433.7327812249 | 38914.2628467944 |
22 | 16569.0756607657 | 14722.6495181326 | 25846.3448111743 |
23 | 21246.9946132969 | 19170.8405936012 | 28929.5176668179 |
24 | 49496.9676313078 | 34589.9709344227 | 26278.4788852099 |
25 | 38157.8211572865 | 22204.4697786791 | 11306.9727314499 |
26 | 30319.8546655358 | 17702.3863860653 | 18000.4687557743 |
27 | 14045.3614142255 | 14890.9269070635 | 27010.6740378817 |
34 | 44953.2503770168 | 31688.2255131206 | 20194.2070803233 |
35 | 32381.1640585531 | 48259.4765952585 | 60995.7334108935 |
36 | 96731.9412224668 | 89595.7421322603 | 81809.3499966411 |
37 | 79560.1639081888 | 68649.0636042383 | 58065.0319637561 |
38 | 46883.7881826109 | 31260.9442558865 | 17656.2585875449 |
28 | 55168.9428012 | 39794.5152995221 | 26215.2530753636 |
29 | 25746.277934528 | 13666.8813860563 | 10084.299011982 |
30 | 82071.025867796 | 69898.0393157701 | 58247.5862369418 |
31 | 37452.564965451 | 27353.9259767576 | 20502.1114293489 |
32 | 50914.9297093897 | 37305.236361112 | 25121.432732103 |
33 | 69612.2141257714 | 58251.0484538573 | 47514.468595391 |
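Note that the 2D distances in Out[5] are much smaller than the originals in Out[4] (for patients 39 and 40, about 16038 versus about 101531): two coordinates capture only part of the variation. The sketch below illustrates this on simulated data, not the Golub data. It also illustrates why this section is titled MDS/PCA: for Euclidean distances, classical MDS coordinates should coincide with the leading principal component scores up to the sign of each axis. The matrix `X` and all variable names are assumptions made for the example.

```r
set.seed(1)
# Simulated stand-in for a transposed expression matrix: 10 samples (rows) x 50 features.
X <- matrix(rnorm(10 * 50), nrow = 10)
D <- dist(X)

# Two coordinates give only an approximation of the original distances.
mds2 <- cmdscale(D, k = 2)
max.err.2d <- max(abs(as.matrix(dist(mds2)) - as.matrix(D)))

# With k = n - 1 coordinates the Euclidean distances are reproduced (up to rounding).
mds.full <- cmdscale(D, k = nrow(X) - 1)
max.err.full <- max(abs(as.matrix(dist(mds.full)) - as.matrix(D)))

# Classical MDS on Euclidean distances matches PCA scores up to the sign of each axis.
pca2 <- prcomp(X)$x[, 1:2]
max.diff.pca <- max(abs(abs(mds2) - abs(pca2)))

c(max.err.2d = max.err.2d, max.err.full = max.err.full, max.diff.pca = max.diff.pca)
```

The first error is large, while the other two are at the level of floating-point rounding.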
In [6]:
# Now, we just plot the result.
plot(MDS,col=c(2,4)[ pData(Golub_Merge)[["ALL.AML"]]],pch=19,type="n",xlab="",ylab="")
text(MDS,as.character(pData(Golub_Merge)[["ALL.AML"]]),col=c(2,4)[pData(Golub_Merge)[["ALL.AML"]]])
# We observe the clustering of AML patients together and ALL patients together.
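The colour argument in the plot relies on indexing a vector with a factor: R uses the factor's integer codes, so the first level picks the first colour and the second level picks the second. A small sketch with a made-up label vector standing in for `pData(Golub_Merge)[["ALL.AML"]]`:

```r
# Made-up class labels standing in for the ALL.AML column.
labels <- factor(c("ALL", "AML", "ALL", "AML", "AML"))

# Levels are sorted alphabetically: "ALL" is level 1, "AML" is level 2.
levels(labels)

# Indexing by a factor uses its integer codes, so ALL -> colour 2 and AML -> colour 4
# (red and blue in R's default palette).
point.colours <- c(2, 4)[labels]
point.colours
```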
Heat Map
We will now generate the ‘noisy’ heatmap from this morning’s lecture by simulating data with no correspondence between gene expression and group membership.
In [7]:
# Simulate noisy HM
# load the necessary libraries
library(pheatmap)
library(genefilter)
Attaching package: ‘genefilter’
The following object is masked from ‘package:base’:
anyNA
In [8]:
# n = number of patients in each group, m = number of genes, alpha = significance level
# The data simulation is almost exactly the same as the noise simulation from the supervised learning lab
simulate.noise.heatmap=function(n,m,alpha)
{
  # Simulate the expression matrix (m genes x 2n patients) from independent N(0,1) draws
  EXPRS=matrix(rnorm(2*n*m),m,2*n)
  rownames(EXPRS)=paste("Gene",1:m,sep="")
  colnames(EXPRS)=paste("patient id",1:(2*n),sep="")
  # Assign the first n patients to the first group and the next n patients to the second group
  grp=factor(rep(0:1,c(n,n)))
  # Run a two-sample t-test for each gene (row) and extract the p-values
  pvals=rowttests(EXPRS, grp)$p.value
  # Choose the genes that have p-values smaller than the significance level
  topgenes=which(pvals<alpha)
  # drop=FALSE keeps EXPRS a matrix even if only one gene is selected
  EXPRS=EXPRS[topgenes,,drop=FALSE]
  # Annotate which group each patient is in
  annodat=data.frame(Condition=ifelse(grp==0,"N","Y"),row.names=colnames(EXPRS))
  # Draw the heatmap
  pheatmap(EXPRS,
           border_color=NA,
           show_rownames=FALSE,
           show_colnames=FALSE,
           annotation_col=annodat,
           color=colorRampPalette(c("red3", "black", "green3"))(50),
           annotation_colors=list(Condition=c(Y="blue",N="yellow")))
  return(length(topgenes)) # return the number of genes selected
}
In [9]:
set.seed(765)
# Run with 20 patients per group, 20000 genes and significance level 0.005
aa=simulate.noise.heatmap(20,20000,0.005)
print(aa)
[1] 104
In [10]:
# Now simulate 40000 genes with significance level 0.0025
aa=simulate.noise.heatmap(20,40000,0.0025)
print(aa)
[1] 99
In [11]:
# And now a sample of only 6 patients, with 20000 genes and significance level of 0.005
aa=simulate.noise.heatmap(3,20000,0.005)
print(aa)
[1] 110
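The three counts (104, 99 and 110) are close to what chance alone predicts. When no gene is truly different, p-values are uniformly distributed, so the expected number of genes passing the threshold is m × alpha, whatever the group size. The sketch below checks this arithmetic and reruns a smaller simulation in base R. The helper `count.false.positives` is hypothetical, and it uses an equal-variance `t.test` per gene as a stand-in for `rowttests`, with a smaller m so that it runs quickly.

```r
# Expected number of genes passing the threshold when there is no real signal.
expected.hits <- c(run1 = 20000 * 0.005, run2 = 40000 * 0.0025, run3 = 20000 * 0.005)
expected.hits  # all three equal 100

# Hypothetical helper: simulate pure noise and count genes with p < alpha.
# Base-R stand-in for rowttests(): an equal-variance two-sample t-test per gene.
count.false.positives <- function(n, m, alpha) {
  exprs.sim <- matrix(rnorm(2 * n * m), m, 2 * n)
  grp <- rep(0:1, c(n, n))
  pvals <- apply(exprs.sim, 1, function(x) {
    t.test(x[grp == 0], x[grp == 1], var.equal = TRUE)$p.value
  })
  sum(pvals < alpha)
}

set.seed(765)
# Smaller m and larger alpha than in the lab, chosen so the expected count is again 100.
hits <- count.false.positives(n = 3, m = 2000, alpha = 0.05)
hits
```

The count should land near 100 even with only 3 patients per group, which is why a clean-looking heatmap of selected genes is not by itself evidence of a real group difference.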