tag:blogger.com,1999:blog-42748233668559676192024-03-05T16:13:40.228+01:00Statistic on aiR<img src="http://img694.imageshack.us/img694/1906/front2s.jpg"><br>Solved problems of statistic with RTodos Logoshttp://www.blogger.com/profile/09881188152777475558noreply@blogger.comBlogger40125tag:blogger.com,1999:blog-4274823366855967619.post-3329426577115718712015-01-20T19:51:00.000+01:002015-01-20T19:51:23.925+01:00Adjustment for Multiple Comparison Tests with R: Resources on the web<b><span style="font-size: large;">1. Bonferroni correction</span></b><br />
<br />
<span style="font-family: Courier New, Courier, monospace;">p.adjust(p, method = "bonferroni")</span><br />
<br />
Read: <a href="http://en.wikipedia.org/wiki/Bonferroni_correction">http://en.wikipedia.org/wiki/</a><br />
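<br />
As a quick check of what this option does: the Bonferroni-adjusted p-values are the raw ones multiplied by the number of tests, capped at 1. A minimal sketch (the vector <code>p</code> is made-up example data, not taken from any study):<br />

```r
# hypothetical raw p-values from 8 tests (illustrative only)
p <- c(0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205)

# Bonferroni by hand: multiply by the number of tests, cap at 1
bonf_manual <- pmin(1, p * length(p))

# the built-in function should give the same values
bonf_builtin <- p.adjust(p, method = "bonferroni")
all.equal(bonf_manual, bonf_builtin)
```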
<br />
<b><span style="font-size: large;">2. Sidak (Dunn-Sidak) correction</span></b><br />
<br />
Read: <a href="http://en.wikipedia.org/wiki/%C5%A0id%C3%A1k_correction">http://en.wikipedia.org/wiki/</a><br />
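<br />
<code>p.adjust()</code> has no Sidak method, but the adjustment is a one-liner. The sketch below assumes independent tests and uses a made-up vector <code>p</code>; the names <code>sidak</code> and <code>bonf</code> are only illustrative:<br />

```r
# hypothetical raw p-values (illustrative only)
p <- c(0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205)
n <- length(p)

# Sidak-adjusted p-values: 1 - (1 - p)^n
sidak <- 1 - (1 - p)^n

# for comparison: Bonferroni is never less conservative than Sidak
bonf <- p.adjust(p, method = "bonferroni")
cbind(raw = p, sidak = sidak, bonferroni = bonf)
```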
<br />
<span style="font-size: large;"><b>3. Holm-Bonferroni correction</b></span><br />
<br />
<span style="font-family: Courier New, Courier, monospace;">p.adjust(p, method = "holm")</span><br />
<br />
Read: <a href="http://en.wikipedia.org/wiki/Holm%E2%80%93Bonferroni_method">http://en.wikipedia.org/wiki/</a><br />
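<br />
To see what the Holm step-down procedure does, here is a sketch that recomputes the adjusted values by hand and compares them with <code>p.adjust()</code>. The vector <code>p</code> is made-up example data, and <code>holm_manual</code> is just an illustrative name:<br />

```r
# hypothetical raw p-values (illustrative only)
p <- c(0.041, 0.001, 0.205, 0.008, 0.06, 0.039, 0.074, 0.042)
n <- length(p)

o  <- order(p)   # positions of p-values from smallest to largest
ro <- order(o)   # permutation that restores the original order

# the i-th smallest p-value is multiplied by (n - i + 1);
# cummax() keeps the adjusted values non-decreasing, pmin() caps them at 1
holm_manual <- pmin(1, cummax((n - seq_len(n) + 1) * p[o]))[ro]

all.equal(holm_manual, p.adjust(p, method = "holm"))
```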
<br />
<span style="font-size: large;"><b>4. Hochberg correction</b></span><br />
<br />
<span style="font-family: Courier New, Courier, monospace;">p.adjust(p, method = "hochberg")</span><br />
<br />
Read: <a href="http://stats.stackexchange.com/questions/71466/what-are-hommel-hochberg-corrections">http://stats.stackexchange.com/questions/</a><br />
Read: <a href="http://onbiostatistics.blogspot.it/">http://onbiostatistics.blogspot.it/</a><br />
<br />
<span style="font-size: large;"><b>5. Hommel correction</b></span><br />
<br />
<span style="font-family: Courier New, Courier, monospace;">p.adjust(p, method = "hommel")</span><br />
<br />
Read: <a href="http://stats.stackexchange.com/questions/71466/what-are-hommel-hochberg-corrections">http://stats.stackexchange.com/questions</a><br />
<br />
<span style="font-size: large;"><b>6. Benjamini-Hochberg correction</b></span><br />
<br />
<span style="font-family: Courier New, Courier, monospace;">p.adjust(p, method = "BH")</span><br />
or equivalently<br />
<span style="font-family: Courier New, Courier, monospace;">p.adjust(p, method = "fdr")</span><br />
<br />
Read: <a href="http://nebc.nerc.ac.uk/courses/GeneSpring/GS_Mar2006/Multiple%20testing%20corrections.pdf">http://nebc.nerc.ac.uk/courses/</a><br />
Read: <a href="http://en.wikipedia.org/wiki/False_discovery_rate#Benjamini.E2.80.93Hochberg.E2.80.93Yekutieli_procedure">http://en.wikipedia.org/wiki/</a><br />
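<br />
The Benjamini-Hochberg adjustment can also be reproduced by hand, which helps when checking results. This is a sketch on a made-up vector <code>p</code>; <code>bh_manual</code> is only an illustrative name:<br />

```r
# hypothetical raw p-values (illustrative only)
p <- c(0.041, 0.001, 0.205, 0.008, 0.06, 0.039, 0.074, 0.042)
n <- length(p)

o  <- order(p, decreasing = TRUE)  # from largest to smallest
ro <- order(o)                     # restores the original order
i  <- n:1                          # rank of each sorted p-value

# step-up: p * n / rank, made monotone with cummin(), capped at 1
bh_manual <- pmin(1, cummin(n / i * p[o]))[ro]

all.equal(bh_manual, p.adjust(p, method = "BH"))
identical(p.adjust(p, method = "BH"), p.adjust(p, method = "fdr"))
```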
<br />
<b><span style="font-size: large;">7. Benjamini–Yekutieli (Benjamini–Hochberg–Yekutieli) correction</span></b><br />
<br />
<span style="font-family: Courier New, Courier, monospace;">p.adjust(p, method = "BY")</span><br />
<br />
Read: <a href="http://en.wikipedia.org/wiki/False_discovery_rate#Benjamini.E2.80.93Hochberg.E2.80.93Yekutieli_procedure">http://en.wikipedia.org/wiki/</a>Todos Logoshttp://www.blogger.com/profile/09881188152777475558noreply@blogger.com0tag:blogger.com,1999:blog-4274823366855967619.post-59234756409739224212014-02-05T20:50:00.000+01:002014-02-05T21:02:10.922+01:00ggPlot2: Histogram with jittered stripchartHere is an example of a Histogram plot, with a stripchart (vertically jittered) along the x side of the plot.<br />
<br />
<script src="https://gist.github.com/statistic-on-air/8831527.js"></script><br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhO8sJZSd5L3L2Q3yGdZ5sY1SsTsMLaQL7GF-9QRNpLr3dm9Uwo-Jyg7prZKz5vRwoBXvmhZXv6J0CpDOtbrONOXjgQqa-fCgEoDw2lQcQg4N3hqaMehSGjX6CbMfHEO3Q5ScU7w8iJTHU/s1600/hist1.jpeg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhO8sJZSd5L3L2Q3yGdZ5sY1SsTsMLaQL7GF-9QRNpLr3dm9Uwo-Jyg7prZKz5vRwoBXvmhZXv6J0CpDOtbrONOXjgQqa-fCgEoDw2lQcQg4N3hqaMehSGjX6CbMfHEO3Q5ScU7w8iJTHU/s1600/hist1.jpeg" height="153" width="320" /></a></div>
<br />
<br />
Alternatively, using the geom_rug function:<br />
<br />
<script src="https://gist.github.com/statistic-on-air/8831562.js"></script><br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6_WqjeeTco8JU7HVwm4YLffOfkBRXDSmpyeXMHE3NpxldTAPXm7Dpus1w_v4zTeYaqxorZ_4-hyCK-6fE7Y6dqrhjQEtGGj8zHD_rHho3Aca07FFvnLai7IOBJffys8gDXBorTdXDzcw/s1600/hist2.jpeg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6_WqjeeTco8JU7HVwm4YLffOfkBRXDSmpyeXMHE3NpxldTAPXm7Dpus1w_v4zTeYaqxorZ_4-hyCK-6fE7Y6dqrhjQEtGGj8zHD_rHho3Aca07FFvnLai7IOBJffys8gDXBorTdXDzcw/s1600/hist2.jpeg" height="153" width="320" /></a></div>
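<br />
For readers who cannot load the embedded gists, both variants can be sketched roughly as follows. This is not the original gist code: the simulated data, the bin count and the jitter height are assumptions chosen only for illustration.<br />

```r
library(ggplot2)

# simulated data (illustrative only)
set.seed(42)
d <- data.frame(x = rnorm(200))

# variant 1: histogram plus a vertically jittered stripchart drawn at y = -2
p1 <- ggplot(d, aes(x = x)) +
  geom_histogram(bins = 30) +
  geom_jitter(aes(y = -2), height = 1, width = 0)

# variant 2: histogram plus a rug along the x axis
p2 <- ggplot(d, aes(x = x)) +
  geom_histogram(bins = 30) +
  geom_rug()

# print(p1); print(p2)   # uncomment to draw the plots
```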
<br />
Of course, this simplistic method needs tuning: adjust the vertical position of the stripchart or rug (y=-2 here) and the amount of jitter to suit your data.Todos Logoshttp://www.blogger.com/profile/09881188152777475558noreply@blogger.com2tag:blogger.com,1999:blog-4274823366855967619.post-41453225940542850382014-02-02T13:14:00.000+01:002014-02-02T13:14:21.400+01:00Boxplot with mean and standard deviation in ggPlot2 (plus Jitter)When you create a <b>boxplot</b> in R, it automatically computes the median, the first and third quartiles ("<i>hinges</i>") and an approximate 95% confidence interval of the median ("<i>notches</i>").<br />
<div>
<br /></div>
<div>
But suppose we want the plot to show, instead of these defaults, the <b>mean</b>, the mean + standard deviation, the mean - S.D., the min and the max values.</div>
<div>
Here is an example using the <b>ggplot2</b> package. The individual values are also drawn as points, jittered horizontally.</div>
<script src="https://gist.github.com/statistic-on-air/8767421.js"></script>
<span style="font-family: Courier New, Courier, monospace;"><b><!-----></b></span><br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi5b3Q6hNmDPHK4HsGAhp0xgwS8Oj2dNJbJPCLNstPUeVUPjwXhu3gUfNFcNe_b9Msr17eW2hdugCINuglBnjDps59UQi3nV2OdT0-zAVMHErC9WgqGrzysKP8AEA55XAwTqhjncn9P4a8/s1600/Immaginet3.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi5b3Q6hNmDPHK4HsGAhp0xgwS8Oj2dNJbJPCLNstPUeVUPjwXhu3gUfNFcNe_b9Msr17eW2hdugCINuglBnjDps59UQi3nV2OdT0-zAVMHErC9WgqGrzysKP8AEA55XAwTqhjncn9P4a8/s1600/Immaginet3.jpg" height="320" width="270" /></a></div>
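<br />
In case the embedded gist does not load, one way to get this kind of plot is to pass a custom summary function to <code>stat_summary()</code>. The sketch below is not the gist code: the data are simulated, and <code>mean_sd_box</code> is a hypothetical helper name.<br />

```r
library(ggplot2)

# simulated data for two groups (illustrative only)
set.seed(1)
df <- data.frame(
  group = rep(c("A", "B"), each = 30),
  value = c(rnorm(30, mean = 10, sd = 2), rnorm(30, mean = 12, sd = 3))
)

# return the five values the boxplot geom expects:
# whisker ends = min/max, box edges = mean -/+ sd, middle line = mean
mean_sd_box <- function(x) {
  m <- mean(x)
  s <- sd(x)
  data.frame(ymin = min(x), lower = m - s, middle = m,
             upper = m + s, ymax = max(x))
}

p <- ggplot(df, aes(x = group, y = value)) +
  stat_summary(fun.data = mean_sd_box, geom = "boxplot") +
  geom_jitter(width = 0.1, height = 0)   # horizontal jitter only

# print(p)   # uncomment to draw the plot
```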
Todos Logoshttp://www.blogger.com/profile/09881188152777475558noreply@blogger.com8tag:blogger.com,1999:blog-4274823366855967619.post-68444952995839949262011-09-17T22:44:00.009+02:002012-03-29T05:50:23.445+02:00Implementation of the CDC Growth Charts in RI implemented in R a function to re-create the <a href="http://www.cdc.gov/growthcharts/clinical_charts.htm" target="_blank">CDC Growth Chart</a>, according to the <a href="http://www.cdc.gov/growthcharts/data_tables.htm" target="_blank">data provided by the CDC</a>.<br /><br />In order to use this function, you need to download the .rar file available at this <a href="http://www.megaupload.com/?d=A2DYUNQ4" target="_blank">megaupload link</a>.<br /><br /><b>Mirror: <a href="http://www.mediafire.com/?wye2iu1d68c9a0n" target="_blank">mediafire link</a>.</b><br /><br />Then unrar the file, and put the <b>Growth</b> folder in your R working directory. You are now able to use the two functions I'm going to illustrate.<br /><br /><span class="fullpost"><br /><center><hr width="50%"></center><br /><br /><b><big>growthFun.R</big></b><br /><br /><script src="https://gist.github.com/1224344.js?file=growthFun.R"></script><br /><br />The function <code style="color: rgb(153, 0, 0);">growthFun</code> allows you to draw 8 different growth charts, each available for males and females (sixteen in total).<br />The only parameters you need to input are:<br /><code style="color: rgb(153, 0, 0);"> sex = c("m", "f")</code><br /><code style="color: rgb(153, 0, 0);"> type = c("wac36", "lac36", "wlc", "hac", "wsc", "wac20", "lac20", "bac")</code><br />The codes for the type parameter are explained in the first part of the function code.<br />If needed, you can modify the <code style="color: rgb(153, 0, 0);">pat</code> variable, if you want to put the <b>Growth</b> folder in another place (not in the R working directory).<br /><br />I recommend using the <code style="color: rgb(153, 0, 0);">pdf()</code> graphics device 
for best resolution.<br /><br />Here is an example of the output you can obtain, with the following code:<br /><br /><pre>pdf("hac_example.pdf", paper="a4", width=0, height=0)<br />growthFun("m", "hac")<br />dev.off()<br /></pre><br /><br /><iframe src="https://docs.google.com/gview?a=v&pid=explorer&chrome=false&api=true&embedded=true&srcid=0B5Soe5lALqepYzlhN2FjYTgtY2Q2ZS00MjRmLThkMGUtMGM5MjU2ZTg0M2Iz&hl=en" frameborder="0" height="560px" width="100%"></iframe><br /><br /><center><hr width="50%"></center><br /><br /><b><big>MygrowthFun.R</big></b><br /><br /><script src="https://gist.github.com/1224345.js?file=MygrowthFun.R"></script><br /><br />The function <code style="color: rgb(153, 0, 0);">MygrowthFun</code> allows you to personalize the output of the previous function with a specific patient's data.<br />The parameters you can modify are:<br /><code style="color: rgb(153, 0, 0);"> sex=c("m", "f")<br />type=c("wac36", "lac36", "wlc", "hac", "wsc", "wac20", "lac20", "bac", "bmi.adv")<br />path="./Growth/"<br />name = NULL<br />surname = NULL<br />birth_date = NULL<br />mydataAA = NULL</code><br /><br />The three parameters <code style="color: rgb(153, 0, 0);">sex</code>, <code style="color: rgb(153, 0, 0);">type</code> and <code style="color: rgb(153, 0, 0);">path</code> are the same as in the <code style="color: rgb(153, 0, 0);">growthFun</code> function. The three parameters <code style="color: rgb(153, 0, 0);">name</code>, <code style="color: rgb(153, 0, 0);">surname</code> and <code style="color: rgb(153, 0, 0);">birth_date</code> refer to the patient's data; you can supply them as <code style="color: rgb(153, 0, 0);">character()</code> values.<br /><code style="color: rgb(153, 0, 0);">mydataAA</code> is an optional parameter holding the values measured on your patient during follow-up. 
Generally you need to input this data in the form of a <code style="color: rgb(153, 0, 0);">data.frame()</code>.<br />In the <code style="color: rgb(153, 0, 0);">type</code> parameter there is an additional choice: <code style="color: rgb(153, 0, 0);">bmi.adv</code> allows you to obtain three charts (<code style="color: rgb(153, 0, 0);">wac20, lac20, bac</code> - see the explanation codes), if your <code style="color: rgb(153, 0, 0);">mydataAA</code> data frame contains data about stature and weight during the follow-up period.<br /><br /><b><big>Details.</big></b><br />Let's see the format of <code style="color: rgb(153, 0, 0);">mydataAA</code>, according to the <code style="color: rgb(153, 0, 0);">type</code> of chart you want to graph.<br /><br /><pre><b>type = wac36</b><br /><b>mydataAA: </b><br />first column = months of measurement, from 0 to 36<br />second column = weight (in kg)<br /><br /><b>type = lac36</b><br /><b>mydataAA: </b><br />first column = months of measurement, from 0 to 36<br />second column = length (in cm)<br /><br /><b>type = hac</b><br /><b>mydataAA: </b><br />first column = months of measurement, from 0 to 36<br />second column = head circumference (in cm)<br /><br /><b>type = wac20</b><br /><b>mydataAA: </b><br />first column = months of measurement, from 24 to 240 (from 2 to 20 years)<br />second column = weight (in kg)<br /><br /><b>type = lac20</b><br /><b>mydataAA: </b><br />first column = months of measurement, from 24 to 240 (from 2 to 20 years)<br />second column = stature (in cm)<br /><br /><b>type = bmi.adv</b><br /><b>mydataAA: </b><br />first column (<b>months</b>) = months of measurement, from 24 to 240 (from 2 to 20 years)<br />second column (<b>stature</b>) = stature (in cm)<br />third column (<b>weight</b>)= weight (in kg)</pre><br />For the last type the order of the columns does not matter, but their names do.<br /><br /><b><big>Examples.</big></b><br />Let's see some examples. 
Suppose that you are following the growth of a newborn (her name is Alyssa Gigave, born on 07/08/2009), and you collect the following data:<br /><br /><pre>Months Length<br />0 50<br />2 55<br />3 56<br />5 61<br />8 71<br />9 72<br />12 75<br />15 75<br />18 81<br />21 89<br />26 91<br />27 94<br />30 95<br />35 98</pre><br /><br />So you can create your personalized graph in this way:<br /><br /><pre>alyssa_data <- data.frame(<br />  months=c(0, 2, 3, 5, 8, 9, 12, 15, 18, 21, 26, 27, 30, 35),<br />  length=c(50, 55, 56, 61, 71, 72, 75, 75, 81, 89, 91, 94, 95, 98))<br />pdf("alyssa_growth_chart.pdf", paper="a4", width=0, height=0)<br />MygrowthFun(sex="f", type="lac36", name="Alyssa", surname="Gigave", birth_date="july 08, 2009", mydataAA=alyssa_data)<br />dev.off()</pre><br /><br />The output is the following pdf file:<br /><br /><iframe src="https://docs.google.com/gview?a=v&pid=explorer&chrome=false&api=true&embedded=true&srcid=0B5Soe5lALqepOWRiZWYyYjItNGVhMi00OWY4LThlZDEtZTE1NjY5MTVjZjQw&hl=en" frameborder="0" height="560px" width="100%"></iframe><br /><br />Now suppose that you're a pediatrician following a boy (Tommy Cigalino, born on 07/08/1980). Whenever he comes to see you, you record his weight and stature, and his age in months. 
So you have the following data:<br /><br /><pre> months stature weight<br /> 25 98 17<br /> 31 100 21<br /> 34 102 27<br /> 35 104 29<br /> 58 106 30<br /> 60 109 32<br /> 70 111 33<br /> 85 118 34<br /> 88 119 36<br /> 89 120 39<br /> 91 121 42<br /> 102 126 45<br /> 107 128 47<br /> 108 135 49<br /> 120 144 51<br /> 134 145 52<br /> 154 148 54<br /> 166 152 55<br /> 169 157 62<br /> 170 158 63<br /> 178 163 64<br /> 179 167 68<br /> 181 168 71<br /> 219 169 74<br /> 234 176 76</pre><br /><br />So you can create three graphs (<code style="color: rgb(153, 0, 0);">wac20, lac20, bac</code>), using the <code style="color: rgb(153, 0, 0);">bmi.adv</code> type:<br /><br /><pre>tommy_data <- data.frame(<br />  months = c( 25, 31, 34, 35, 58, 60, 70, 85, 88, 89, 91, 102, 107, 108, 120, 134, 154, 166, 169, 170, 178, 179, 181, 219, 234),<br />  stature = c( 98, 100, 102, 104, 106, 109, 111, 118, 119, 120, 121, 126, 128, 135, 144, 145, 148, 152, 157, 158, 163, 167, 168, 169, 176),<br />  weight = c( 17, 21, 27, 29, 30, 32, 33, 34, 36, 39, 42, 45, 47, 49, 51, 52, 54, 55, 62, 63, 64, 68, 71, 74, 76))<br />pdf("tommy_growth_chart.pdf", paper="a4", width=0, height=0)<br />MygrowthFun(sex="m", type="bmi.adv", name="Tommy", surname="Cigalino", birth_date="july 08, 1980", mydataAA=tommy_data)<br />dev.off()</pre><br /><br /><iframe src="https://docs.google.com/gview?a=v&pid=explorer&chrome=false&api=true&embedded=true&srcid=0B5Soe5lALqepMjljMjhiZGQtNTkxNi00Zjg2LWJkYWMtMzkxMDlhMDNjZjM4&hl=en" frameborder="0" height="560px" width="100%"></iframe><br /><br /><br /><br /><br /><i><b>Tommaso MARTINO</b>, 17/09/2011</i><br /><br /><br /><br /><b>REFERENCES</b><ul><br /><li>http://www.cdc.gov/growthcharts/cdc_charts.htm</li><br /><li>http://www.cdc.gov/growthcharts/clinical_charts.htm</li><br /><li>http://www.cdc.gov/growthcharts/percentile_data_files.htm</li><br /><li>Kuczmarski RJ, Ogden CL, Guo, SS, et al. CDC growth charts for the United States: Methods and Development. 
Vital Health Stat; 11 (246) National Center for Health Statistics. 2002.</li><br /></ul><br /><br /><br /></span>Todos Logoshttp://www.blogger.com/profile/09881188152777475558noreply@blogger.com8tag:blogger.com,1999:blog-4274823366855967619.post-89841420026356548962011-09-07T16:43:00.007+02:002011-09-07T17:07:17.623+02:00R is a cool sound editor!The capabilities of R are definitely endless! After my previous posts about some easy image editing in R (they are <a href="http://statistic-on-air.blogspot.com/2010/11/r-is-cool-image-editor.html">here</a>, and <a href="http://statistic-on-air.blogspot.com/2011/08/r-is-cool-image-editor-2-dithering.html">here</a>), now it is time to explore whether R is capable of sound editing!<br /><br />Just for fun, here I created a function that receives a phone number (or another sequence of numbers), and returns the melody you would hear if you pressed that sequence on your home phone... =D<br /><br /><span class="fullpost"><br /><br />It requires the <code style="color: rgb(255, 0, 0);">sound</code> library, and here's the code.<br /><br /><script src="https://gist.github.com/1200780.js?file=PlayTel.R"></script><br /><br />Now you can simply create your phone melody =)<br /><br /><code style="color: rgb(255, 0, 0);">s2 <- PlayTel("556c885a4623#")</code><br /><br />You can listen to it with the command:<br /><br /><code style="color: rgb(255, 0, 0);">play(s2)</code><br /><br />(<span style="font-style:italic;">NOTE</span>: in Windows 7 I was unable to find a wave player that works in batch mode, such as mplay32.exe. So this command doesn't work on Windows 7. 
It works on Windows XP)<br /><br />You can save the output using the command:<br /><br /><code style="color: rgb(255, 0, 0);">saveSample(s2, "tel.wav")</code><br /><br />(This command works on Windows 7)<br /><br />Here is an example of the output:<br /><br /><script type="text/javascript">var zippywww="www28";var zippyfile="67433758";var zippydown="ffffff";var zippyfront="000000";var zippyback="ffffff";var zippylight="000000";var zippywidth=300;var zippyauto=false;var zippyvol=60;</script><script type="text/javascript" src="http://api.zippyshare.com/api/embed.js"></script><br /><br />Have fun!! =)<br /></span>Todos Logoshttp://www.blogger.com/profile/09881188152777475558noreply@blogger.com0tag:blogger.com,1999:blog-4274823366855967619.post-49218748826518313202011-08-29T11:11:00.002+02:002011-08-29T11:20:07.883+02:00R is a cool image editor #2: Dithering algorithmsHere I implemented in R some dithering algorithms:
<br />- <b>Floyd-Steinberg dithering</b>
<br />- <b>Bill Atkinson dithering</b>
<br />- <b>Jarvis-Judice-Ninke dithering</b>
<br />- <b>Sierra 2-4a dithering</b>
<br />- <b>Stucki dithering</b>
<br />- <b>Burkes dithering</b>
<br />- <b>Sierra2 dithering</b>
<br />- <b>Sierra3 dithering</b>
<br />
<br />For each algorithm, I wrote a 2-dimensional convolution function (a matrix passing over a matrix); it is <i>slow</i> because I didn't implement any speed-up tricks. It could easily be implemented in C and then called from R for a faster solution.
<br />Then, a function to transform a grey image into a grey-dithered image is provided, with an example. The library <b>rimage</b> was used for loading and displaying images (see the other post <a href="http://statistic-on-air.blogspot.com/2010/11/r-is-cool-image-editor.html">R is a cool image editor</a>).
<br />These functions can easily be re-coded for an RGB image.
<br />Only the first listing is commented, because they are all very similar.
<br />
<br /><pre>
<br />library(rimage)
<br />y <- read.jpeg("valve.jpg")
<br />plot(y)
<br /></pre>
<br /><center><a target='_blank' href='http://img695.imageshack.us/img695/9885/originalys.jpg'><img src='http://img695.imageshack.us/img695/9885/originalys.th.jpg' border='0'/></a></center>
<br /><span class="fullpost">
<br /><center><hr width=50%></center>
<br />
<br /><center><big><u>Floyd-Steinberg dithering</u></big></center>
<br />
<br /><script src="https://gist.github.com/1178029.js?file=Floyd-Steinberg%20dithering"></script>
<br />
<br /><pre>plot(normalize(grey2FSdith(rgb2grey(y))))</pre>
<br />
<br /><center><a target='_blank' href='http://img190.imageshack.us/img190/5848/fsdith.jpg'><img src='http://img190.imageshack.us/img190/5848/fsdith.th.jpg' border='0'/></a></center>
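<br />
<br />If the embedded gist does not load, the idea of Floyd-Steinberg error diffusion can be sketched in a few lines of base R. This is not the <code>grey2FSdith</code> code from the gist: it is a plain double loop over a numeric matrix with values in [0, 1], the function name <code>fs_dither</code> is hypothetical, and the 0.5 threshold is an assumption.

```r
# Floyd-Steinberg dithering of a grey matrix with values in [0, 1]
fs_dither <- function(img) {
  out <- img
  nr <- nrow(out)
  nc <- ncol(out)
  for (i in seq_len(nr)) {
    for (j in seq_len(nc)) {
      old <- out[i, j]
      new <- if (old >= 0.5) 1 else 0   # quantize to black or white
      out[i, j] <- new
      err <- old - new                  # quantization error to spread
      # push the error to the not-yet-visited neighbours (7, 3, 5, 1)/16
      if (j < nc) out[i, j + 1] <- out[i, j + 1] + err * 7 / 16
      if (i < nr) {
        if (j > 1) out[i + 1, j - 1] <- out[i + 1, j - 1] + err * 3 / 16
        out[i + 1, j] <- out[i + 1, j] + err * 5 / 16
        if (j < nc) out[i + 1, j + 1] <- out[i + 1, j + 1] + err * 1 / 16
      }
    }
  }
  out
}

# usage on a synthetic 64 x 64 horizontal gradient (illustrative data)
img  <- matrix(rep(seq(0, 1, length.out = 64), each = 64), nrow = 64)
dith <- fs_dither(img)
# image(dith, col = gray(0:1))   # uncomment to display the result
```

<br />The other filters listed in this post differ only in which neighbours receive the error and with what weights.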
<br />
<br /><center><hr width=50%></center>
<br />
<br /><center><big><u>Bill Atkinson dithering</u></big></center>
<br />
<br /><script src="https://gist.github.com/1178030.js?file=Bill%20Atkinson%20dithering"></script>
<br />
<br /><pre>plot(normalize(grey2ATKdith(rgb2grey(y))))</pre>
<br />
<br /><center><a target='_blank' href='http://img855.imageshack.us/img855/8114/atkdith.jpg'><img src='http://img855.imageshack.us/img855/8114/atkdith.th.jpg' border='0'/></a></center>
<br />
<br /><center><hr width=50%></center>
<br />
<br /><center><big><u>Jarvis-Judice-Ninke dithering</u></big></center>
<br />
<br /><script src="https://gist.github.com/1178033.js?file=Jarvis-Judice-Ninke%20dithering"></script>
<br />
<br /><pre>plot(normalize(grey2JJNdith(rgb2grey(y))))</pre>
<br />
<br /><center><a target='_blank' href='http://img717.imageshack.us/img717/1548/jjndith.jpg'><img src='http://img717.imageshack.us/img717/1548/jjndith.th.jpg' border='0'/></a></center>
<br />
<br /><center><hr width=50%></center>
<br />
<br /><center><big><u>Sierra 2-4a dithering filter</u></big></center>
<br />
<br /><script src="https://gist.github.com/1178034.js?file=Sierra%202-4a%20dithering"></script>
<br />
<br /><pre>plot(normalize(grey2S24adith(rgb2grey(y))))</pre>
<br />
<br /><center><a target='_blank' href='http://img535.imageshack.us/img535/8584/s24adith.jpg'><img src='http://img535.imageshack.us/img535/8584/s24adith.th.jpg' border='0'/></a></center>
<br />
<br /><center><hr width=50%></center>
<br />
<br /><center><big><u>Stucki dithering</u></big></center>
<br />
<br /><script src="https://gist.github.com/1178036.js?file=Stucki%20dithering"></script>
<br />
<br /><pre>plot(normalize(grey2Stucki(rgb2grey(y))))</pre>
<br />
<br /><center><a target='_blank' href='http://img714.imageshack.us/img714/108/stuckidith.jpg'><img src='http://img714.imageshack.us/img714/108/stuckidith.th.jpg' border='0'/></a></center>
<br />
<br /><center><hr width=50%></center>
<br />
<br /><center><big><u>Burkes dithering</u></big></center>
<br />
<br /><script src="https://gist.github.com/1178037.js?file=Burkes%20dithering"></script>
<br />
<br /><pre>plot(normalize(grey2Burkes(rgb2grey(y))))</pre>
<br />
<br /><center><a target='_blank' href='http://img153.imageshack.us/img153/3861/burkesdith.jpg'><img src='http://img153.imageshack.us/img153/3861/burkesdith.th.jpg' border='0'/></a></center>
<br />
<br /><center><hr width=50%></center>
<br />
<br /><center><big><u>Sierra2 dithering</u></big></center>
<br />
<br /><script src="https://gist.github.com/1178039.js?file=Sierra2%20dithering"></script>
<br />
<br /><pre>plot(normalize(grey2Sierra2(rgb2grey(y))))</pre>
<br />
<br /><center><a target='_blank' href='http://img97.imageshack.us/img97/9585/sierra2dith.jpg'><img src='http://img97.imageshack.us/img97/9585/sierra2dith.th.jpg' border='0'/></a></center>
<br />
<br /><center><hr width=50%></center>
<br />
<br /><center><big><u>Sierra3 dithering</u></big></center>
<br />
<br /><script src="https://gist.github.com/1178040.js?file=Sierra3%20dithering"></script>
<br />
<br /><pre>plot(normalize(grey2Sierra3(rgb2grey(y))))</pre>
<br />
<br /><center><a target='_blank' href='http://img692.imageshack.us/img692/4319/sierra3dith.jpg'><img src='http://img692.imageshack.us/img692/4319/sierra3dith.th.jpg' border='0'/></a></center>
<br /></span>Todos Logoshttp://www.blogger.com/profile/09881188152777475558noreply@blogger.com0tag:blogger.com,1999:blog-4274823366855967619.post-10341940007030205712011-08-25T23:30:00.007+02:002011-08-25T23:46:16.169+02:00Benford's law, or the First-digit law<b>Benford's law</b>, also called the <i>first-digit law</i>, states that in lists of numbers from many (but not all) real-life sources of data, the leading digit is distributed in a specific, non-uniform way. According to this law, the first digit is 1 about 30% of the time, and larger digits occur as the leading digit with lower and lower frequency, to the point where 9 as a first digit occurs less than 5% of the time.
<br /><span style="font-style:italic;">Wikipedia, retrieved 08/25/2011</span>
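<br />
<br />The law gives the probability of leading digit d as log10(1 + 1/d). As a small check, the sketch below computes these expected frequencies and compares them with the leading digits of the powers of 2. The variable names and the choice of 2^1, ..., 2^1000 are only illustrative.

```r
# expected Benford frequencies for leading digits 1..9
d <- 1:9
expected <- log10(1 + 1 / d)

# leading digits of 2^1, ..., 2^1000, read from scientific notation
powers <- 2^(1:1000)
digits <- as.integer(substr(formatC(powers, format = "e", digits = 5), 1, 1))

# observed relative frequencies, in the same order as 'expected'
observed <- as.numeric(table(factor(digits, levels = d))) / length(digits)

round(cbind(digit = d, expected = expected, observed = observed), 3)
```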
<br />
<br /><iframe width="500" height="345" src="http://www.youtube.com/embed/6KmeGpjeLZ0" frameborder="0" allowfullscreen></iframe>
<br /><iframe width="500" height="345" src="http://www.youtube.com/embed/SZUDoEdjTzg" frameborder="0" allowfullscreen></iframe>
<br />
<br /><span class="fullpost">
<br /><span style="font-weight:bold;">R simulation:</span>
<br /><code style="color: rgb(153, 0, 0);">
<br />library(MASS)
<br />benford <- function(m, n){
<br />list <- c()
<br />
<br /># compute m^i for i = 1, 2, ..., n
<br />for(i in 1:n){
<br />list[i] <- m^i
<br />}
<br />
<br /># a function to extract the first digit from a number
<br />bben <- function(k){
<br />as.numeric(head(strsplit(as.character(k),'')[[1]],n=1))
<br />}
<br />
<br /># extract the first digit from all numbers computed
<br />first.digit <- sapply(list, bben)
<br />
<br /># plot frequency of first digits
<br />truehist(first.digit, nbins=10, main=m)
<br />}
<br />
<br />par(mfrow=c(2,2))
<br />benford(2,1000)
<br />benford(3,640) # with a much larger n, 3^n overflows to Inf
<br />benford(4,500)
<br />benford(5,440)
<br /></code>
<br />
<br /><a target='_blank' href='http://img840.imageshack.us/img840/2315/benford.jpg'><img src='http://img840.imageshack.us/img840/2315/benford.th.jpg' border='0'/></a>
<br /></span>Todos Logoshttp://www.blogger.com/profile/09881188152777475558noreply@blogger.com0tag:blogger.com,1999:blog-4274823366855967619.post-30681933432168604992011-06-16T09:52:00.002+02:002011-06-16T09:56:08.428+02:00How to plot points, regression line and residuals<code style="color: rgb(153, 0, 0);"><br />x <- c(173, 169, 176, 166, 161, 164, 160, 158, 180, 187)<br />y <- c(80, 68, 72, 75, 70, 65, 62, 60, 85, 92)<br /><br /><span style="font-style:italic;"># plot scatterplot and the regression line</span><br />mod1 <- lm(y ~ x)<br />plot(x, y, xlim=c(min(x)-5, max(x)+5), ylim=c(min(y)-10, max(y)+10))<br />abline(mod1, lwd=2)<br /><br /><span class="fullpost"><br /><br /><span style="font-style:italic;"># calculate residuals and predicted values</span><br />res <- signif(residuals(mod1), 5)<br />pre <- predict(mod1)<br /><br /><span style="font-style:italic;"># plot distances between points and the regression line</span><br />segments(x, y, x, pre, col="red")<br /><br /><span style="font-style:italic;"># add labels (res values) to points</span><br />library(calibrate)<br />textxy(x, y, res, cx=0.7)<br /></code><br /><a target='_blank' href='http://img851.imageshack.us/img851/1310/plotres.jpg'><img src='http://img851.imageshack.us/img851/1310/plotres.th.jpg' border='0'/></a><br /><br /></span>Todos Logoshttp://www.blogger.com/profile/09881188152777475558noreply@blogger.com2tag:blogger.com,1999:blog-4274823366855967619.post-41470846874409958362010-11-07T11:40:00.003+01:002010-11-07T12:24:41.971+01:00R is a cool image editor!Here I present some functions I wrote to recreate some of the most common image effects available in most image editors.<br />They require the library <span style="font-weight:bold;">rimage</span>.<br />To load the image, use:<br /><br /><pre>y <span style=""><-</span> read.jpeg<span style="color: #009900;">(</span><span style="color: #0000ff;">"path"</span><span style="color: #009900;">)</span></pre><br /><br />To display the image, use:<br 
/><br /><pre><span style="color: #003399; font-weight: bold;">plot</span><span style="color: #009900;">(</span>y<span style="color: #009900;">)</span></pre><br /><br /><center><hr width=50%></center><br /><center><big><u>Original image</u></big></center><br /><br /><center><a target='_blank' href='http://img89.imageshack.us/img89/7343/dandg.jpg'><img src='http://img89.imageshack.us/img89/7343/dandg.th.jpg' border='0'/></a></center><br /><br /><span class="fullpost"><br /><center><hr width=50%></center><br /><center><big><u>Sepia tone</u></big></center><br /><br /><pre>rgb2sepia <span style=""><-</span> <span style="color: #003399; font-weight: bold;">function</span><span style="color: #009900;">(</span>img<span style="color: #009900;">)</span><span style="color: #009900;">{</span><br /> iRed <span style=""><-</span> img<span style="color: #009900;">[</span><span style="color: #339933;">,,</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="">*</span><span style="color: #cc66cc;">255</span><br /> iGreen <span style=""><-</span> img<span style="color: #009900;">[</span><span style="color: #339933;">,,</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="">*</span><span style="color: #cc66cc;">255</span><br /> iBlue <span style=""><-</span> img<span style="color: #009900;">[</span><span style="color: #339933;">,,</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span><span style="">*</span><span style="color: #cc66cc;">255</span><br /> <br /> oRed <span style=""><-</span> iRed <span style="">*</span> <span style="color: #cc66cc;">.393</span> <span style="">+</span> iGreen <span style="">*</span> <span style="color: #cc66cc;">.769</span> <span style="">+</span> iBlue <span style="">*</span> <span style="color: #cc66cc;">.189</span><br /> oGreen <span style=""><-</span> iRed <span style="">*</span> <span style="color: #cc66cc;">.349</span> <span 
style="">+</span> iGreen <span style="">*</span> <span style="color: #cc66cc;">.686</span> <span style="">+</span> iBlue <span style="">*</span> <span style="color: #cc66cc;">.168</span><br /> oBlue <span style=""><-</span> iRed <span style="">*</span> <span style="color: #cc66cc;">.272</span> <span style="">+</span> iGreen <span style="">*</span> <span style="color: #cc66cc;">.534</span> <span style="">+</span> iBlue <span style="">*</span> <span style="color: #cc66cc;">.131</span><br /> <br /> qw <span style=""><-</span> <span style="color: #003399; font-weight: bold;">array</span><span style="color: #009900;">(</span> <span style="color: #003399; font-weight: bold;">c</span><span style="color: #009900;">(</span>oRed<span style="">/</span><span style="color: #cc66cc;">255</span> <span style="color: #339933;">,</span> oGreen<span style="">/</span><span style="color: #cc66cc;">255</span> <span style="color: #339933;">,</span> oBlue<span style="">/</span><span style="color: #cc66cc;">255</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span> <span style="color: #003399; font-weight: bold;">dim</span>=<span style="color: #003399; font-weight: bold;">c</span><span style="color: #009900;">(</span><span style="color: #cc66cc;">dim(iRed)[1]</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">dim(iRed)[2]</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">)</span> <span style="color: #009900;">)</span><br /> <br /> imagematrix<span style="color: #009900;">(</span>qw<span style="color: #339933;">,</span> type=<span style="color: #0000ff;">"rgb"</span><span style="color: #009900;">)</span><br /><span style="color: #009900;">}</span><br /> <br /><span style="color: #003399; font-weight: bold;">plot</span><span style="color: #009900;">(</span>rgb2sepia<span style="color: #009900;">(</span>y<span style="color: #009900;">)</span><span style="color: 
#009900;">)</span></pre><br /><br /><center><a target='_blank' href='http://img600.imageshack.us/img600/9976/sepia.png'><img src='http://img600.imageshack.us/img600/9976/sepia.th.png' border='0'/></a></center><br /><br /><center><hr width=50%></center><br /><center><big><u>Negative</u></big></center><br /><br /><pre>rgb2neg <span style=""><-</span> <span style="color: #003399; font-weight: bold;">function</span><span style="color: #009900;">(</span>img<span style="color: #009900;">)</span><span style="color: #009900;">{</span><br /> iRed <span style=""><-</span> img<span style="color: #009900;">[</span><span style="color: #339933;">,,</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><br /> iGreen <span style=""><-</span> img<span style="color: #009900;">[</span><span style="color: #339933;">,,</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><br /> iBlue <span style=""><-</span> img<span style="color: #009900;">[</span><span style="color: #339933;">,,</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span><br /> <br /> oRed <span style=""><-</span> <span style="color: #009900;">(</span><span style="color: #cc66cc;">1</span> <span style="">-</span> iRed<span style="color: #009900;">)</span><br /> oGreen <span style=""><-</span> <span style="color: #009900;">(</span><span style="color: #cc66cc;">1</span> <span style="">-</span> iGreen<span style="color: #009900;">)</span><br /> oBlue <span style=""><-</span> <span style="color: #009900;">(</span><span style="color: #cc66cc;">1</span> <span style="">-</span> iBlue<span style="color: #009900;">)</span><br /> <br /> qw <span style=""><-</span> <span style="color: #003399; font-weight: bold;">array</span><span style="color: #009900;">(</span> <span style="color: #003399; font-weight: bold;">c</span><span style="color: #009900;">(</span>oRed<span style="color: #339933;">,</span> oGreen<span style="color: #339933;">,</span> 
oBlue<span style="color: #009900;">)</span><span style="color: #339933;">,</span> <span style="color: #003399; font-weight: bold;">dim</span>=<span style="color: #003399; font-weight: bold;">c</span><span style="color: #009900;">(</span><span style="color: #cc66cc;">dim(iRed)[1]</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">dim(iRed)[2]</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">)</span> <span style="color: #009900;">)</span><br /> <br /> imagematrix<span style="color: #009900;">(</span>qw<span style="color: #339933;">,</span> type=<span style="color: #0000ff;">"rgb"</span><span style="color: #009900;">)</span><br /><span style="color: #009900;">}</span><br /> <br /><span style="color: #003399; font-weight: bold;">plot</span><span style="color: #009900;">(</span>rgb2neg<span style="color: #009900;">(</span>y<span style="color: #009900;">)</span><span style="color: #009900;">)</span></pre><br /><br /><center><a target='_blank' href='http://img269.imageshack.us/img269/6540/negn.png'><img src='http://img269.imageshack.us/img269/6540/negn.th.png' border='0'/></a></center><br /><br /><center><hr width=50%></center><br /><center><big><u>Pixelation</u></big></center><br /><br /><pre>pixmatr <span style=""><-</span> <span style="color: #003399; font-weight: bold;">function</span><span style="color: #009900;">(</span>a<span style="color: #339933;">,</span> n<span style="color: #009900;">)</span><span style="color: #009900;">{</span><br /> aa <span style=""><-</span> <span style="color: #003399; font-weight: bold;">seq</span><span style="color: #009900;">(</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span><span style="color: #003399; font-weight: bold;">dim</span><span style="color: #009900;">(</span>a<span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: 
#009900;">]</span><span style="color: #339933;">,</span>n<span style="color: #009900;">)</span><br /> ll <span style=""><-</span> <span style="color: #003399; font-weight: bold;">seq</span><span style="color: #009900;">(</span><span style="color: #cc66cc;">1</span><span style="color: #339933;">,</span><span style="color: #003399; font-weight: bold;">dim</span><span style="color: #009900;">(</span>a<span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span>n<span style="color: #009900;">)</span><br /> <br /> <span style="color: #000000; font-weight: bold;">for</span><span style="color: #009900;">(</span>i <span style="color: #000000; font-weight: bold;">in</span> <span style="color: #003399; font-weight: bold;">seq_len</span><span style="color: #009900;">(</span><span style="color: #003399; font-weight: bold;">length</span><span style="color: #009900;">(</span>aa<span style="color: #009900;">)</span><span style="">-</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span> <span style="color: #009900;">)</span><span style="color: #009900;">{</span><br /> <span style="color: #000000; font-weight: bold;">for</span><span style="color: #009900;">(</span>j <span style="color: #000000; font-weight: bold;">in</span> <span style="color: #003399; font-weight: bold;">seq_len</span><span style="color: #009900;">(</span><span style="color: #003399; font-weight: bold;">length</span><span style="color: #009900;">(</span>ll<span style="color: #009900;">)</span><span style="">-</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span> <span style="color: #009900;">)</span><span style="color: #009900;">{</span><br /> sub1 <span style=""><-</span> a<span style="color: #009900;">[</span>aa<span style="color: #009900;">[</span>i<span style="color: #009900;">]</span><span style="">:</span><span style="color: 
#009900;">(</span>aa<span style="color: #009900;">[</span>i<span style="">+</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="">-</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span>ll<span style="color: #009900;">[</span>j<span style="color: #009900;">]</span><span style="">:</span><span style="color: #009900;">(</span>ll<span style="color: #009900;">[</span>j<span style="">+</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="">-</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span><span style="color: #009900;">]</span><br /> k <span style=""><-</span> <span style="color: #003399; font-weight: bold;">mean</span><span style="color: #009900;">(</span>sub1<span style="color: #009900;">)</span><br /> sub1m <span style=""><-</span> <span style="color: #003399; font-weight: bold;">matrix</span><span style="color: #009900;">(</span> <span style="color: #003399; font-weight: bold;">rep</span><span style="color: #009900;">(</span>k<span style="color: #339933;">,</span> n<span style="">*</span>n<span style="color: #009900;">)</span><span style="color: #339933;">,</span> n<span style="color: #339933;">,</span> n<span style="color: #009900;">)</span><br /> a<span style="color: #009900;">[</span>aa<span style="color: #009900;">[</span>i<span style="color: #009900;">]</span><span style="">:</span><span style="color: #009900;">(</span>aa<span style="color: #009900;">[</span>i<span style="">+</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="">-</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span>ll<span style="color: #009900;">[</span>j<span style="color: #009900;">]</span><span style="">:</span><span style="color: #009900;">(</span>ll<span style="color: #009900;">[</span>j<span 
style="">+</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="">-</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span><span style="color: #009900;">]</span> <span style=""><-</span> sub1m<br /> <span style="color: #009900;">}</span><br /> <span style="color: #009900;">}</span><br /> <br /> <span style="color: #000000; font-weight: bold;">for</span><span style="color: #009900;">(</span>j <span style="color: #000000; font-weight: bold;">in</span> <span style="color: #003399; font-weight: bold;">seq_len</span><span style="color: #009900;">(</span><span style="color: #003399; font-weight: bold;">length</span><span style="color: #009900;">(</span>ll<span style="color: #009900;">)</span><span style="">-</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span> <span style="color: #009900;">)</span><span style="color: #009900;">{</span><br /> sub1 <span style=""><-</span> a<span style="color: #009900;">[</span><span style="color: #003399; font-weight: bold;">max</span><span style="color: #009900;">(</span>aa<span style="color: #009900;">)</span><span style="">:</span><span style="color: #003399; font-weight: bold;">dim</span><span style="color: #009900;">(</span>a<span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span>ll<span style="color: #009900;">[</span>j<span style="color: #009900;">]</span><span style="">:</span><span style="color: #009900;">(</span>ll<span style="color: #009900;">[</span>j<span style="">+</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="">-</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span> drop=FALSE<span style="color: #009900;">]</span><br /> k <span style=""><-</span> <span style="color: #003399; font-weight: bold;">mean</span><span style="color: 
#009900;">(</span>sub1<span style="color: #009900;">)</span><br /> sub1m <span style=""><-</span> <span style="color: #003399; font-weight: bold;">matrix</span><span style="color: #009900;">(</span> <span style="color: #003399; font-weight: bold;">rep</span><span style="color: #009900;">(</span>k<span style="color: #339933;">,</span> <span style="color: #003399; font-weight: bold;">nrow</span><span style="color: #009900;">(</span>sub1<span style="color: #009900;">)</span><span style="">*</span><span style="color: #003399; font-weight: bold;">ncol</span><span style="color: #009900;">(</span>sub1<span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span> <span style="color: #003399; font-weight: bold;">nrow</span><span style="color: #009900;">(</span>sub1<span style="color: #009900;">)</span><span style="color: #339933;">,</span> <span style="color: #003399; font-weight: bold;">ncol</span><span style="color: #009900;">(</span>sub1<span style="color: #009900;">)</span><span style="color: #009900;">)</span><br /> a<span style="color: #009900;">[</span><span style="color: #003399; font-weight: bold;">max</span><span style="color: #009900;">(</span>aa<span style="color: #009900;">)</span><span style="">:</span><span style="color: #003399; font-weight: bold;">dim</span><span style="color: #009900;">(</span>a<span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span>ll<span style="color: #009900;">[</span>j<span style="color: #009900;">]</span><span style="">:</span><span style="color: #009900;">(</span>ll<span style="color: #009900;">[</span>j<span style="">+</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="">-</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span><span style="color: #009900;">]</span> <span 
style=""><-</span> sub1m<br /> <span style="color: #009900;">}</span><br /> <br /> <span style="color: #000000; font-weight: bold;">for</span><span style="color: #009900;">(</span>i <span style="color: #000000; font-weight: bold;">in</span> <span style="color: #003399; font-weight: bold;">seq_len</span><span style="color: #009900;">(</span><span style="color: #003399; font-weight: bold;">length</span><span style="color: #009900;">(</span>aa<span style="color: #009900;">)</span><span style="">-</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span> <span style="color: #009900;">)</span><span style="color: #009900;">{</span><br /> sub1 <span style=""><-</span> a<span style="color: #009900;">[</span>aa<span style="color: #009900;">[</span>i<span style="color: #009900;">]</span><span style="">:</span><span style="color: #009900;">(</span>aa<span style="color: #009900;">[</span>i<span style="">+</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="">-</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span><span style="color: #003399; font-weight: bold;">max</span><span style="color: #009900;">(</span>ll<span style="color: #009900;">)</span><span style="">:</span><span style="color: #003399; font-weight: bold;">dim</span><span style="color: #009900;">(</span>a<span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span> drop=FALSE<span style="color: #009900;">]</span><br /> k <span style=""><-</span> <span style="color: #003399; font-weight: bold;">mean</span><span style="color: #009900;">(</span>sub1<span style="color: #009900;">)</span><br /> sub1m <span style=""><-</span> <span style="color: #003399; font-weight: bold;">matrix</span><span style="color: #009900;">(</span> <span style="color: #003399; font-weight: bold;">rep</span><span style="color: 
#009900;">(</span>k<span style="color: #339933;">,</span> <span style="color: #003399; font-weight: bold;">nrow</span><span style="color: #009900;">(</span>sub1<span style="color: #009900;">)</span><span style="">*</span><span style="color: #003399; font-weight: bold;">ncol</span><span style="color: #009900;">(</span>sub1<span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span> <span style="color: #003399; font-weight: bold;">nrow</span><span style="color: #009900;">(</span>sub1<span style="color: #009900;">)</span><span style="color: #339933;">,</span> <span style="color: #003399; font-weight: bold;">ncol</span><span style="color: #009900;">(</span>sub1<span style="color: #009900;">)</span><span style="color: #009900;">)</span><br /> a<span style="color: #009900;">[</span>aa<span style="color: #009900;">[</span>i<span style="color: #009900;">]</span><span style="">:</span><span style="color: #009900;">(</span>aa<span style="color: #009900;">[</span>i<span style="">+</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="">-</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span><span style="color: #003399; font-weight: bold;">max</span><span style="color: #009900;">(</span>ll<span style="color: #009900;">)</span><span style="">:</span><span style="color: #003399; font-weight: bold;">dim</span><span style="color: #009900;">(</span>a<span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="color: #009900;">]</span> <span style=""><-</span> sub1m<br /> <span style="color: #009900;">}</span><br /> <br /> sub1 <span style=""><-</span> a<span style="color: #009900;">[</span><span style="color: #003399; font-weight: bold;">max</span><span style="color: #009900;">(</span>aa<span style="color: 
#009900;">)</span><span style="">:</span><span style="color: #003399; font-weight: bold;">dim</span><span style="color: #009900;">(</span>a<span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span> <span style="color: #003399; font-weight: bold;">max</span><span style="color: #009900;">(</span>ll<span style="color: #009900;">)</span><span style="">:</span><span style="color: #003399; font-weight: bold;">dim</span><span style="color: #009900;">(</span>a<span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span> drop=FALSE<span style="color: #009900;">]</span><br /> k <span style=""><-</span> <span style="color: #003399; font-weight: bold;">mean</span><span style="color: #009900;">(</span>sub1<span style="color: #009900;">)</span><br /> sub1m <span style=""><-</span> <span style="color: #003399; font-weight: bold;">matrix</span><span style="color: #009900;">(</span> <span style="color: #003399; font-weight: bold;">rep</span><span style="color: #009900;">(</span>k<span style="color: #339933;">,</span> <span style="color: #003399; font-weight: bold;">nrow</span><span style="color: #009900;">(</span>sub1<span style="color: #009900;">)</span><span style="">*</span><span style="color: #003399; font-weight: bold;">ncol</span><span style="color: #009900;">(</span>sub1<span style="color: #009900;">)</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span> <span style="color: #003399; font-weight: bold;">nrow</span><span style="color: #009900;">(</span>sub1<span style="color: #009900;">)</span><span style="color: #339933;">,</span> <span style="color: #003399; font-weight: bold;">ncol</span><span style="color: #009900;">(</span>sub1<span style="color: #009900;">)</span><span style="color: #009900;">)</span><br /> a<span style="color: 
#009900;">[</span><span style="color: #003399; font-weight: bold;">max</span><span style="color: #009900;">(</span>aa<span style="color: #009900;">)</span><span style="">:</span><span style="color: #003399; font-weight: bold;">dim</span><span style="color: #009900;">(</span>a<span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span> <span style="color: #003399; font-weight: bold;">max</span><span style="color: #009900;">(</span>ll<span style="color: #009900;">)</span><span style="">:</span><span style="color: #003399; font-weight: bold;">dim</span><span style="color: #009900;">(</span>a<span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="color: #009900;">]</span> <span style=""><-</span> sub1m<br /> <br />a<br /><span style="color: #009900;">}</span><br /> <br />rgb2pix <span style=""><-</span> <span style="color: #003399; font-weight: bold;">function</span><span style="color: #009900;">(</span>img<span style="color: #339933;">,</span>n<span style="color: #009900;">)</span><span style="color: #009900;">{</span><br /> iRed <span style=""><-</span> img<span style="color: #009900;">[</span><span style="color: #339933;">,,</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="">*</span><span style="color: #cc66cc;">255</span><br /> iGreen <span style=""><-</span> img<span style="color: #009900;">[</span><span style="color: #339933;">,,</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="">*</span><span style="color: #cc66cc;">255</span><br /> iBlue <span style=""><-</span> img<span style="color: #009900;">[</span><span style="color: #339933;">,,</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span><span 
style="">*</span><span style="color: #cc66cc;">255</span><br /> <br /> oRed <span style=""><-</span> pixmatr<span style="color: #009900;">(</span>iRed<span style="color: #339933;">,</span>n<span style="color: #009900;">)</span><br /> oGreen <span style=""><-</span> pixmatr<span style="color: #009900;">(</span>iGreen<span style="color: #339933;">,</span>n<span style="color: #009900;">)</span><br /> oBlue <span style=""><-</span> pixmatr<span style="color: #009900;">(</span>iBlue<span style="color: #339933;">,</span>n<span style="color: #009900;">)</span><br /> <br /> qw <span style=""><-</span> <span style="color: #003399; font-weight: bold;">array</span><span style="color: #009900;">(</span> <span style="color: #003399; font-weight: bold;">c</span><span style="color: #009900;">(</span>oRed<span style="">/</span><span style="color: #cc66cc;">255</span> <span style="color: #339933;">,</span> oGreen<span style="">/</span><span style="color: #cc66cc;">255</span> <span style="color: #339933;">,</span> oBlue<span style="">/</span><span style="color: #cc66cc;">255</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span> <span style="color: #003399; font-weight: bold;">dim</span>=<span style="color: #003399; font-weight: bold;">c</span><span style="color: #009900;">(</span><span style="color: #cc66cc;">dim(iRed)[1]</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">dim(iRed)[2]</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">)</span> <span style="color: #009900;">)</span><br /> <br /> imagematrix<span style="color: #009900;">(</span>qw<span style="color: #339933;">,</span> type=<span style="color: #0000ff;">"rgb"</span><span style="color: #009900;">)</span><br /><span style="color: #009900;">}</span><br /> <br /><span style="color: #003399; font-weight: bold;">plot</span><span style="color: #009900;">(</span>rgb2pix<span style="color: #009900;">(</span>y<span 
style="color: #339933;">,</span> <span style="color: #cc66cc;">6</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><br /><span style="color: #003399; font-weight: bold;">plot</span><span style="color: #009900;">(</span>rgb2pix<span style="color: #009900;">(</span>y<span style="color: #339933;">,</span> <span style="color: #cc66cc;">10</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span></pre><br /><br /><center><a href="http://img29.imageshack.us/img29/9105/pix1h.png" target="_blank"><img src="http://img29.imageshack.us/img29/9105/pix1h.th.png" border="0"/></a></center><br /><center><a href="http://img530.imageshack.us/img530/3725/pix2.png" target="_blank"><img src="http://img530.imageshack.us/img530/3725/pix2.th.png" border="0"/></a></center><br /><br /><center><hr width=50%></center><br /><center><big><u>Remove red</u></big></center><br /><br /><pre>rgb2blu <span style=""><-</span> <span style="color: #003399; font-weight: bold;">function</span><span style="color: #009900;">(</span>img<span style="color: #009900;">)</span><span style="color: #009900;">{</span><br /> iRed <span style=""><-</span> img<span style="color: #009900;">[</span><span style="color: #339933;">,,</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><br /> iGreen <span style=""><-</span> img<span style="color: #009900;">[</span><span style="color: #339933;">,,</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><br /> iBlue <span style=""><-</span> img<span style="color: #009900;">[</span><span style="color: #339933;">,,</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span><br /> <br /> oRed <span style=""><-</span> <span style="color: #003399; font-weight: bold;">matrix</span><span style="color: #009900;">(</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">,</span> <span style="color: #003399; font-weight: 
bold;">dim</span><span style="color: #009900;">(</span>iRed<span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span> <span style="color: #003399; font-weight: bold;">dim</span><span style="color: #009900;">(</span>iRed<span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span><br /> oGreen <span style=""><-</span> iGreen<br /> oBlue <span style=""><-</span> iBlue<br /> <br /> qw <span style=""><-</span> <span style="color: #003399; font-weight: bold;">array</span><span style="color: #009900;">(</span> <span style="color: #003399; font-weight: bold;">c</span><span style="color: #009900;">(</span>oRed<span style="color: #339933;">,</span> oGreen<span style="color: #339933;">,</span> oBlue<span style="color: #009900;">)</span><span style="color: #339933;">,</span> <span style="color: #003399; font-weight: bold;">dim</span>=<span style="color: #003399; font-weight: bold;">c</span><span style="color: #009900;">(</span><span style="color: #cc66cc;">dim(iRed)[1]</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">dim(iRed)[2]</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">)</span> <span style="color: #009900;">)</span><br /> <br /> imagematrix<span style="color: #009900;">(</span>qw<span style="color: #339933;">,</span> type=<span style="color: #0000ff;">"rgb"</span><span style="color: #009900;">)</span><br /><span style="color: #009900;">}</span><br /> <br /><span style="color: #003399; font-weight: bold;">plot</span><span style="color: #009900;">(</span>rgb2blu<span style="color: #009900;">(</span>y<span style="color: #009900;">)</span><span style="color: #009900;">)</span></pre><br /><br /><center><a target='_blank' 
href='http://img44.imageshack.us/img44/3697/blupa.png'><img src='http://img44.imageshack.us/img44/3697/blupa.th.png' border='0'/></a></center><br /><br /><center><hr width=50%></center><br /><center><big><u>Remove green</u></big></center><br /><br /><pre>rgb2vio <span style=""><-</span> <span style="color: #003399; font-weight: bold;">function</span><span style="color: #009900;">(</span>img<span style="color: #009900;">)</span><span style="color: #009900;">{</span><br /> iRed <span style=""><-</span> img<span style="color: #009900;">[</span><span style="color: #339933;">,,</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><br /> iGreen <span style=""><-</span> img<span style="color: #009900;">[</span><span style="color: #339933;">,,</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><br /> iBlue <span style=""><-</span> img<span style="color: #009900;">[</span><span style="color: #339933;">,,</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span><br /> <br /> oRed <span style=""><-</span> iRed<br /> oGreen <span style=""><-</span> <span style="color: #003399; font-weight: bold;">matrix</span><span style="color: #009900;">(</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">,</span> <span style="color: #003399; font-weight: bold;">dim</span><span style="color: #009900;">(</span>iRed<span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span> <span style="color: #003399; font-weight: bold;">dim</span><span style="color: #009900;">(</span>iRed<span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span><br /> oBlue <span style=""><-</span> iBlue<br /> <br /> qw <span style=""><-</span> <span 
style="color: #003399; font-weight: bold;">array</span><span style="color: #009900;">(</span> <span style="color: #003399; font-weight: bold;">c</span><span style="color: #009900;">(</span>oRed<span style="color: #339933;">,</span> oGreen<span style="color: #339933;">,</span> oBlue<span style="color: #009900;">)</span><span style="color: #339933;">,</span> <span style="color: #003399; font-weight: bold;">dim</span>=<span style="color: #003399; font-weight: bold;">c</span><span style="color: #009900;">(</span><span style="color: #cc66cc;">dim(iRed)[1]</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">dim(iRed)[2]</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">)</span> <span style="color: #009900;">)</span><br /> <br /> imagematrix<span style="color: #009900;">(</span>qw<span style="color: #339933;">,</span> type=<span style="color: #0000ff;">"rgb"</span><span style="color: #009900;">)</span><br /><span style="color: #009900;">}</span><br /> <br /><span style="color: #003399; font-weight: bold;">plot</span><span style="color: #009900;">(</span>rgb2vio<span style="color: #009900;">(</span>y<span style="color: #009900;">)</span><span style="color: #009900;">)</span></pre><br /><br /><center><a target='_blank' href='http://img801.imageshack.us/img801/9264/vios.png'><img src='http://img801.imageshack.us/img801/9264/vios.th.png' border='0'/></a></center><br /><br /><center><hr width=50%></center><br /><center><big><u>Remove blue</u></big></center><br /><br /><pre>rgb2yel <span style=""><-</span> <span style="color: #003399; font-weight: bold;">function</span><span style="color: #009900;">(</span>img<span style="color: #009900;">)</span><span style="color: #009900;">{</span><br /> iRed <span style=""><-</span> img<span style="color: #009900;">[</span><span style="color: #339933;">,,</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><br /> iGreen <span 
style=""><-</span> img<span style="color: #009900;">[</span><span style="color: #339933;">,,</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><br /> iBlue <span style=""><-</span> img<span style="color: #009900;">[</span><span style="color: #339933;">,,</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span><br /> <br /> oRed <span style=""><-</span> iRed<br /> oGreen <span style=""><-</span> iGreen<br /> oBlue <span style=""><-</span> <span style="color: #003399; font-weight: bold;">matrix</span><span style="color: #009900;">(</span><span style="color: #cc66cc;">0</span><span style="color: #339933;">,</span> <span style="color: #003399; font-weight: bold;">dim</span><span style="color: #009900;">(</span>iRed<span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #339933;">,</span> <span style="color: #003399; font-weight: bold;">dim</span><span style="color: #009900;">(</span>iRed<span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span><br /> <br /> qw <span style=""><-</span> <span style="color: #003399; font-weight: bold;">array</span><span style="color: #009900;">(</span> <span style="color: #003399; font-weight: bold;">c</span><span style="color: #009900;">(</span>oRed<span style="color: #339933;">,</span> oGreen<span style="color: #339933;">,</span> oBlue<span style="color: #009900;">)</span><span style="color: #339933;">,</span> <span style="color: #003399; font-weight: bold;">dim</span>=<span style="color: #003399; font-weight: bold;">c</span><span style="color: #009900;">(</span><span style="color: #cc66cc;">dim(iRed)[1]</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">dim(iRed)[2]</span><span style="color: 
#339933;">,</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">)</span> <span style="color: #009900;">)</span><br /> <br /> imagematrix<span style="color: #009900;">(</span>qw<span style="color: #339933;">,</span> type=<span style="color: #0000ff;">"rgb"</span><span style="color: #009900;">)</span><br /><span style="color: #009900;">}</span><br /> <br /><span style="color: #003399; font-weight: bold;">plot</span><span style="color: #009900;">(</span>rgb2yel<span style="color: #009900;">(</span>y<span style="color: #009900;">)</span><span style="color: #009900;">)</span></pre><br /><br /><center><a href="http://img59.imageshack.us/img59/4473/yel.png" target="_blank"><img src="http://img59.imageshack.us/img59/4473/yel.th.png" border="0"/></a></center><br /><br /><center><hr width=50%></center><br /><center><big><u>Adjust brightness</u></big></center><br /><br /><pre>rgb2bri <span style=""><-</span> <span style="color: #003399; font-weight: bold;">function</span><span style="color: #009900;">(</span>img<span style="color: #339933;">,</span> n<span style="color: #009900;">)</span><span style="color: #009900;">{</span><br /> iRed <span style=""><-</span> img<span style="color: #009900;">[</span><span style="color: #339933;">,,</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><br /> iGreen <span style=""><-</span> img<span style="color: #009900;">[</span><span style="color: #339933;">,,</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><br /> iBlue <span style=""><-</span> img<span style="color: #009900;">[</span><span style="color: #339933;">,,</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span><br /> <br /> oRed <span style=""><-</span> iRed <span style="">+</span> <span style="color: #009900;">(</span>iRed <span style="">*</span> n<span style="color: #009900;">)</span><br /> oGreen <span style=""><-</span> iGreen <span style="">+</span> <span 
style="color: #009900;">(</span>iGreen <span style="">*</span> n<span style="color: #009900;">)</span><br /> oBlue <span style=""><-</span> iBlue <span style="">+</span> <span style="color: #009900;">(</span>iBlue <span style="">*</span> n<span style="color: #009900;">)</span><br /> <br /> qw <span style=""><-</span> <span style="color: #003399; font-weight: bold;">array</span><span style="color: #009900;">(</span> <span style="color: #003399; font-weight: bold;">c</span><span style="color: #009900;">(</span>oRed<span style="color: #339933;">,</span> oGreen<span style="color: #339933;">,</span> oBlue<span style="color: #009900;">)</span><span style="color: #339933;">,</span> <span style="color: #003399; font-weight: bold;">dim</span>=<span style="color: #003399; font-weight: bold;">c</span><span style="color: #009900;">(</span><span style="color: #cc66cc;">dim(iRed)[1]</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">dim(iRed)[2]</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">)</span> <span style="color: #009900;">)</span><br /> <br /> imagematrix<span style="color: #009900;">(</span>qw<span style="color: #339933;">,</span> type=<span style="color: #0000ff;">"rgb"</span><span style="color: #009900;">)</span><br /><span style="color: #009900;">}</span><br /> <br /><span style="color: #003399; font-weight: bold;">plot</span><span style="color: #009900;">(</span>rgb2bri<span style="color: #009900;">(</span>y<span style="color: #339933;">,</span> <span style="">+</span><span style="color: #cc66cc;">0.5</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><br /><span style="color: #003399; font-weight: bold;">plot</span><span style="color: #009900;">(</span>rgb2bri<span style="color: #009900;">(</span>y<span style="color: #339933;">,</span> <span style="">-</span><span style="color: #cc66cc;">0.5</span><span style="color: #009900;">)</span><span 
style="color: #009900;">)</span></pre><br /><br /><center><a href="http://img408.imageshack.us/img408/9818/bri1.png" target="_blank"><img src="http://img408.imageshack.us/img408/9818/bri1.th.png" border="0"/></a></center><br /><center><a href="http://img683.imageshack.us/img683/5796/bri2.png" target="_blank"><img src="http://img683.imageshack.us/img683/5796/bri2.th.png" border="0"/></a></center><br /><br /><center><hr width=50%></center><br /><center><big><u>Truncate colors into bands (posterize)</u></big></center><br /><br /><pre>rgb2ban <span style=""><-</span> <span style="color: #003399; font-weight: bold;">function</span><span style="color: #009900;">(</span>img<span style="color: #339933;">,</span> n<span style="color: #009900;">)</span><span style="color: #009900;">{</span><br /> iRed <span style=""><-</span> img<span style="color: #009900;">[</span><span style="color: #339933;">,,</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="">*</span><span style="color: #cc66cc;">255</span><br /> iGreen <span style=""><-</span> img<span style="color: #009900;">[</span><span style="color: #339933;">,,</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="">*</span><span style="color: #cc66cc;">255</span><br /> iBlue <span style=""><-</span> img<span style="color: #009900;">[</span><span style="color: #339933;">,,</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span><span style="">*</span><span style="color: #cc66cc;">255</span><br /> <br /> band_size <span style=""><-</span> <span style="color: #003399; font-weight: bold;">trunc</span><span style="color: #009900;">(</span><span style="color: #cc66cc;">255</span><span style="">/</span>n<span style="color: #009900;">)</span><br /> <br /> oRed <span style=""><-</span> band_size <span style="">*</span> <span style="color: #003399; font-weight: bold;">trunc</span><span style="color: #009900;">(</span>iRed <span 
style="">/</span> band_size<span style="color: #009900;">)</span><br /> oGreen <span style=""><-</span> band_size <span style="">*</span> <span style="color: #003399; font-weight: bold;">trunc</span><span style="color: #009900;">(</span>iGreen <span style="">/</span> band_size<span style="color: #009900;">)</span><br /> oBlue <span style=""><-</span> band_size <span style="">*</span> <span style="color: #003399; font-weight: bold;">trunc</span><span style="color: #009900;">(</span>iBlue <span style="">/</span> band_size<span style="color: #009900;">)</span><br /> <br /> qw <span style=""><-</span> <span style="color: #003399; font-weight: bold;">array</span><span style="color: #009900;">(</span> <span style="color: #003399; font-weight: bold;">c</span><span style="color: #009900;">(</span>oRed<span style="">/</span><span style="color: #cc66cc;">255</span><span style="color: #339933;">,</span> oGreen<span style="">/</span><span style="color: #cc66cc;">255</span><span style="color: #339933;">,</span> oBlue<span style="">/</span><span style="color: #cc66cc;">255</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span> <span style="color: #003399; font-weight: bold;">dim</span>=<span style="color: #003399; font-weight: bold;">c</span><span style="color: #009900;">(</span><span style="color: #cc66cc;">dim(iRed)[1]</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">dim(iRed)[2]</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">)</span> <span style="color: #009900;">)</span><br /> <br /> imagematrix<span style="color: #009900;">(</span>qw<span style="color: #339933;">,</span> type=<span style="color: #0000ff;">"rgb"</span><span style="color: #009900;">)</span><br /><span style="color: #009900;">}</span><br /> <br /><span style="color: #003399; font-weight: bold;">plot</span><span style="color: #009900;">(</span>rgb2ban<span style="color: #009900;">(</span>y<span 
style="color: #339933;">,</span> <span style="color: #cc66cc;">5</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span><br /><span style="color: #003399; font-weight: bold;">plot</span><span style="color: #009900;">(</span>rgb2ban<span style="color: #009900;">(</span>y<span style="color: #339933;">,</span> <span style="color: #cc66cc;">10</span><span style="color: #009900;">)</span><span style="color: #009900;">)</span></pre><br /><br /><center><a href="http://img263.imageshack.us/img263/2084/ban1b.png/" target="_blank"><img src="http://img263.imageshack.us/img263/2084/ban1b.th.png" border="0"/></a></center><br /><center><a href="http://img574.imageshack.us/img574/4012/ban2.png" target="_blank"><img src="http://img574.imageshack.us/img574/4012/ban2.th.png" border="0"/></a></center><br /><br /><center><hr width=50%></center><br /><center><big><u>Solarize</u></big></center><br /><br /><pre>rgb2sol <span style=""><-</span> <span style="color: #003399; font-weight: bold;">function</span><span style="color: #009900;">(</span>img<span style="color: #009900;">)</span><span style="color: #009900;">{</span><br /> iRed <span style=""><-</span> img<span style="color: #009900;">[</span><span style="color: #339933;">,,</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="">*</span><span style="color: #cc66cc;">255</span><br /> iGreen <span style=""><-</span> img<span style="color: #009900;">[</span><span style="color: #339933;">,,</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="">*</span><span style="color: #cc66cc;">255</span><br /> iBlue <span style=""><-</span> img<span style="color: #009900;">[</span><span style="color: #339933;">,,</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">]</span><span style="">*</span><span style="color: #cc66cc;">255</span><br /> <br /> <span style="color: #000000; font-weight: bold;">for</span><span 
style="color: #009900;">(</span>i <span style="color: #000000; font-weight: bold;">in</span> <span style="color: #cc66cc;">1</span><span style="">:</span><span style="color: #003399; font-weight: bold;">dim</span><span style="color: #009900;">(</span>iRed<span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span><br /> <span style="color: #000000; font-weight: bold;">for</span><span style="color: #009900;">(</span>j <span style="color: #000000; font-weight: bold;">in</span> <span style="color: #cc66cc;">1</span><span style="">:</span><span style="color: #003399; font-weight: bold;">dim</span><span style="color: #009900;">(</span>iRed<span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span><br /> <span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">(</span>iRed<span style="color: #009900;">[</span>i<span style="color: #339933;">,</span>j<span style="color: #009900;">]</span><span style=""><</span><span style="color: #cc66cc;">128</span><span style="color: #009900;">)</span> iRed<span style="color: #009900;">[</span>i<span style="color: #339933;">,</span>j<span style="color: #009900;">]</span> <span style=""><-</span> <span style="color: #cc66cc;">255</span><span style="">-</span><span style="color: #cc66cc;">2</span><span style="">*</span>iRed<span style="color: #009900;">[</span>i<span style="color: #339933;">,</span>j<span style="color: #009900;">]</span><br /> <span style="color: #000000; font-weight: bold;">else</span> iRed<span style="color: #009900;">[</span>i<span style="color: #339933;">,</span>j<span style="color: #009900;">]</span> <span style=""><-</span> <span style="color: 
#cc66cc;">2</span><span style="">*</span><span style="color: #009900;">(</span>iRed<span style="color: #009900;">[</span>i<span style="color: #339933;">,</span>j<span style="color: #009900;">]</span><span style="">-</span><span style="color: #cc66cc;">128</span><span style="color: #009900;">)</span><br /> <span style="color: #009900;">}</span><br /> <span style="color: #009900;">}</span><br /> <br /> <span style="color: #000000; font-weight: bold;">for</span><span style="color: #009900;">(</span>i <span style="color: #000000; font-weight: bold;">in</span> <span style="color: #cc66cc;">1</span><span style="">:</span><span style="color: #003399; font-weight: bold;">dim</span><span style="color: #009900;">(</span>iGreen<span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span><br /> <span style="color: #000000; font-weight: bold;">for</span><span style="color: #009900;">(</span>j <span style="color: #000000; font-weight: bold;">in</span> <span style="color: #cc66cc;">1</span><span style="">:</span><span style="color: #003399; font-weight: bold;">dim</span><span style="color: #009900;">(</span>iGreen<span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span><br /> <span style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">(</span>iGreen<span style="color: #009900;">[</span>i<span style="color: #339933;">,</span>j<span style="color: #009900;">]</span><span style=""><</span><span style="color: #cc66cc;">128</span><span style="color: #009900;">)</span> iGreen<span style="color: #009900;">[</span>i<span style="color: #339933;">,</span>j<span style="color: #009900;">]</span> <span style=""><-</span> <span 
style="color: #cc66cc;">255</span><span style="">-</span><span style="color: #cc66cc;">2</span><span style="">*</span>iGreen<span style="color: #009900;">[</span>i<span style="color: #339933;">,</span>j<span style="color: #009900;">]</span><br /> <span style="color: #000000; font-weight: bold;">else</span> iGreen<span style="color: #009900;">[</span>i<span style="color: #339933;">,</span>j<span style="color: #009900;">]</span> <span style=""><-</span> <span style="color: #cc66cc;">2</span><span style="">*</span><span style="color: #009900;">(</span>iGreen<span style="color: #009900;">[</span>i<span style="color: #339933;">,</span>j<span style="color: #009900;">]</span><span style="">-</span><span style="color: #cc66cc;">128</span><span style="color: #009900;">)</span><br /> <span style="color: #009900;">}</span><br /> <span style="color: #009900;">}</span><br /> <br /> <span style="color: #000000; font-weight: bold;">for</span><span style="color: #009900;">(</span>i <span style="color: #000000; font-weight: bold;">in</span> <span style="color: #cc66cc;">1</span><span style="">:</span><span style="color: #003399; font-weight: bold;">dim</span><span style="color: #009900;">(</span>iBlue<span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">1</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span><br /> <span style="color: #000000; font-weight: bold;">for</span><span style="color: #009900;">(</span>j <span style="color: #000000; font-weight: bold;">in</span> <span style="color: #cc66cc;">1</span><span style="">:</span><span style="color: #003399; font-weight: bold;">dim</span><span style="color: #009900;">(</span>iBlue<span style="color: #009900;">)</span><span style="color: #009900;">[</span><span style="color: #cc66cc;">2</span><span style="color: #009900;">]</span><span style="color: #009900;">)</span><span style="color: #009900;">{</span><br /> <span 
style="color: #000000; font-weight: bold;">if</span><span style="color: #009900;">(</span>iBlue<span style="color: #009900;">[</span>i<span style="color: #339933;">,</span>j<span style="color: #009900;">]</span><span style=""><</span><span style="color: #cc66cc;">128</span><span style="color: #009900;">)</span> iBlue<span style="color: #009900;">[</span>i<span style="color: #339933;">,</span>j<span style="color: #009900;">]</span> <span style=""><-</span> <span style="color: #cc66cc;">255</span><span style="">-</span><span style="color: #cc66cc;">2</span><span style="">*</span>iBlue<span style="color: #009900;">[</span>i<span style="color: #339933;">,</span>j<span style="color: #009900;">]</span><br /> <span style="color: #000000; font-weight: bold;">else</span> iBlue<span style="color: #009900;">[</span>i<span style="color: #339933;">,</span>j<span style="color: #009900;">]</span> <span style=""><-</span> <span style="color: #cc66cc;">2</span><span style="">*</span><span style="color: #009900;">(</span>iBlue<span style="color: #009900;">[</span>i<span style="color: #339933;">,</span>j<span style="color: #009900;">]</span><span style="">-</span><span style="color: #cc66cc;">128</span><span style="color: #009900;">)</span><br /> <span style="color: #009900;">}</span><br /> <span style="color: #009900;">}</span><br /> <br /> qw <span style=""><-</span> <span style="color: #003399; font-weight: bold;">array</span><span style="color: #009900;">(</span> <span style="color: #003399; font-weight: bold;">c</span><span style="color: #009900;">(</span>iRed<span style="">/</span><span style="color: #cc66cc;">255</span><span style="color: #339933;">,</span> iGreen<span style="">/</span><span style="color: #cc66cc;">255</span><span style="color: #339933;">,</span> iBlue<span style="">/</span><span style="color: #cc66cc;">255</span><span style="color: #009900;">)</span><span style="color: #339933;">,</span> <span style="color: #003399; font-weight: bold;">dim</span>=<span 
style="color: #003399; font-weight: bold;">c</span><span style="color: #009900;">(</span><span style="color: #cc66cc;">dim(iRed)[1]</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">dim(iRed)[2]</span><span style="color: #339933;">,</span><span style="color: #cc66cc;">3</span><span style="color: #009900;">)</span> <span style="color: #009900;">)</span><br /> <br /> imagematrix<span style="color: #009900;">(</span>qw<span style="color: #339933;">,</span> type=<span style="color: #0000ff;">"rgb"</span><span style="color: #009900;">)</span><br /><span style="color: #009900;">}</span><br /> <br /><span style="color: #003399; font-weight: bold;">plot</span><span style="color: #009900;">(</span>rgb2sol<span style="color: #009900;">(</span>y<span style="color: #009900;">)</span><span style="color: #009900;">)</span></pre><br /><br /><center><a href="http://img602.imageshack.us/img602/5332/sol.png" target="_blank"><img src="http://img602.imageshack.us/img602/5332/sol.th.png" border="0"/></a></center><br /><br /></span>Todos Logoshttp://www.blogger.com/profile/09881188152777475558noreply@blogger.com4tag:blogger.com,1999:blog-4274823366855967619.post-57223422366109444412010-10-19T20:13:00.002+02:002010-10-19T20:18:35.622+02:00Fast matrix inversionMuch as we did when creating a function for fast multiplication of large matrices with the Strassen algorithm (see <a href="http://statistic-on-air.blogspot.com/2010/10/fast-matrix-multiplication-in-r.html">previous post</a>), we now write functions to quickly calculate the <a href="http://en.wikipedia.org/wiki/Invertible_matrix" target="_blank">inverse of a matrix</a>.<br /><br /><span class="fullpost">To avoid rewriting pages and pages of comments and formulas, as I did for matrix multiplication, this time I'll show the code of the functions directly (the reasoning behind it is quite similar). 
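<br /><br />The reasoning can be sketched on a small example. Splitting A into four blocks, the inverse follows from two half-size inverses (of the top-left block and of its Schur complement) plus a few block multiplications; the R1 to R7 steps in the functions below follow this scheme, with R5 equal to minus the Schur complement. The snippet is only an illustrative sketch: the 4 x 4 matrix is made up, and the variable names (<code>S</code>, <code>Sinv</code>, <code>blockInv</code> and so on) are assumptions that do not appear in the original functions.

```r
# Block-wise inversion of a 4 x 4 matrix through the Schur complement.
# Illustrative sketch: the matrix is made up and base-R solve() is used
# for the two half-size inverses.
A <- matrix(c(4, 1, 2, 0,
              1, 5, 0, 1,
              2, 0, 6, 1,
              0, 1, 1, 7), nrow = 4, byrow = TRUE)

h  <- nrow(A) / 2
i1 <- 1:h                # rows/columns of the first half
i2 <- (h + 1):(2 * h)    # rows/columns of the second half

A11 <- A[i1, i1]; A12 <- A[i1, i2]
A21 <- A[i2, i1]; A22 <- A[i2, i2]

A11inv <- solve(A11)                     # first half-size inverse
S      <- A22 - A21 %*% A11inv %*% A12   # Schur complement of A11
Sinv   <- solve(S)                       # second half-size inverse

C12 <- -(A11inv %*% A12 %*% Sinv)
C21 <- -(Sinv %*% A21 %*% A11inv)
C11 <- A11inv - A11inv %*% A12 %*% C21
C22 <- Sinv

blockInv <- rbind(cbind(C11, C12), cbind(C21, C22))

# largest absolute difference from the built-in inverse
max(abs(blockInv - solve(A)))
```

The difference from <code>solve(A)</code> should be of the order of machine precision. Replacing the plain <code>%*%</code> products with Strassen multiplications is the idea behind <code>strassenInv2</code> and <code>strassenInv3</code>.<br /><br />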
Please, copy and paste all the code in an external editor to see it properly.<br /><br /> <div style="margin-bottom: 2px;">Function <code style="color: rgb(0, 0, 0);">strassenInv(A)</code><br /> <div style="margin-top: 5px; text-align: center;"><input value="Show" style="margin-top: 5px; width: 60px; font-size: 10px;" onclick="if (this.parentNode.parentNode.getElementsByTagName('div')[1].getElementsByTagName('div')[0].style.display != '') { this.parentNode.parentNode.getElementsByTagName('div')[1].getElementsByTagName('div')[0].style.display = ''; this.innerText = ''; this.value = 'Hide'; } else { this.parentNode.parentNode.getElementsByTagName('div')[1].getElementsByTagName('div')[0].style.display = 'none'; this.innerText = ''; this.value = 'Show'; }" type="button"> </div><br /> <div style="border: 1px inset ; margi : 0px; padding: 6px;"><div style="display: none;"><br /><pre><br />strassenInv <- function(A){<br /><br /> div4 <- function(A, r){<br /> A <- list(A)<br /> A11 <- A[[1]][1:(r/2),1:(r/2)]<br /> A12 <- A[[1]][1:(r/2),(r/2+1):r]<br /> A21 <- A[[1]][(r/2+1):r,1:(r/2)]<br /> A22 <- A[[1]][(r/2+1):r,(r/2+1):r]<br /> A <- list(X11=A11, X12=A12, X21=A21, X22=A22)<br /> return(A)<br /> }<br /><br /> if (nrow(A) != ncol(A)) <br /> { stop("only square matrices can be inverted") }<br /><br /> is.wholenumber <-<br /> function(x, tol = .Machine$double.eps^0.5) abs(x - round(x)) < tol<br /><br /> if ( (is.wholenumber(log(nrow(A), 2)) != TRUE) || (is.wholenumber(log(ncol(A), 2)) != TRUE) )<br /> { stop("only square matrices of dimension 2^k * 2^k can be inverted with Strassen method") }<br /><br /> A <- div4(A, dim(A)[1])<br /><br /> R1 <- solve(A$X11)<br /> R2 <- A$X21 %*% R1<br /> R3 <- R1 %*% A$X12<br /> R4 <- A$X21 %*% R3<br /> R5 <- R4 - A$X22<br /> R6 <- solve(R5)<br /> C12 <- R3 %*% R6<br /> C21 <- R6 %*% R2<br /> R7 <- R3 %*% C21<br /> C11 <- R1 - R7<br /> C22 <- -R6<br /> <br /> C <- rbind(cbind(C11,C12), cbind(C21,C22))<br /><br /> return(C)<br />}</pre><br 
/></div></div></div><br /><br /> <div style="margin-bottom: 2px;">Function <code style="color: rgb(0, 0, 0);">strassenInv2(A)</code><br /> <div style="margin-top: 5px; text-align: center;"><input value="Show" style="margin-top: 5px; width: 60px; font-size: 10px;" onclick="if (this.parentNode.parentNode.getElementsByTagName('div')[1].getElementsByTagName('div')[0].style.display != '') { this.parentNode.parentNode.getElementsByTagName('div')[1].getElementsByTagName('div')[0].style.display = ''; this.innerText = ''; this.value = 'Hide'; } else { this.parentNode.parentNode.getElementsByTagName('div')[1].getElementsByTagName('div')[0].style.display = 'none'; this.innerText = ''; this.value = 'Show'; }" type="button"> </div><br /> <div style="border: 1px inset ; margi : 0px; padding: 6px;"><div style="display: none;"><br /><pre><br />strassenInv2 <- function(A){<br /><br /> div4 <- function(A, r){<br /> A <- list(A)<br /> A11 <- A[[1]][1:(r/2),1:(r/2)]<br /> A12 <- A[[1]][1:(r/2),(r/2+1):r]<br /> A21 <- A[[1]][(r/2+1):r,1:(r/2)]<br /> A22 <- A[[1]][(r/2+1):r,(r/2+1):r]<br /> A <- list(X11=A11, X12=A12, X21=A21, X22=A22)<br /> return(A)<br /> }<br /><br /> strassen <- function(A, B){<br /> A <- div4(A, dim(A)[1])<br /> B <- div4(B, dim(B)[1])<br /> M1 <- (A$X11+A$X22) %*% (B$X11+B$X22)<br /> M2 <- (A$X21+A$X22) %*% B$X11<br /> M3 <- A$X11 %*% (B$X12-B$X22)<br /> M4 <- A$X22 %*% (B$X21-B$X11)<br /> M5 <- (A$X11+A$X12) %*% B$X22<br /> M6 <- (A$X21-A$X11) %*% (B$X11+B$X12)<br /> M7 <- (A$X12-A$X22) %*% (B$X21+B$X22)<br /><br /> C11 <- M1+M4-M5+M7<br /> C12 <- M3+M5<br /> C21 <- M2+M4<br /> C22 <- M1-M2+M3+M6<br /> <br /> C <- rbind(cbind(C11,C12), cbind(C21,C22))<br /> return(C)<br /> }<br /><br /> if (nrow(A) != ncol(A)) <br /> { stop("only square matrices can be inverted") }<br /><br /> is.wholenumber <-<br /> function(x, tol = .Machine$double.eps^0.5) abs(x - round(x)) < tol<br /><br /> if ( (is.wholenumber(log(nrow(A), 2)) != TRUE) || (is.wholenumber(log(ncol(A), 2)) != 
TRUE) )<br /> { stop("only square matrices of dimension 2^k * 2^k can be inverted with Strassen method") }<br /><br /> A <- div4(A, dim(A)[1])<br /><br /> R1 <- strassenInv(A$X11)<br /> R2 <- strassen(A$X21 , R1)<br /> R3 <- strassen(R1 , A$X12)<br /> R4 <- strassen(A$X21 , R3)<br /> R5 <- R4 - A$X22<br /> R6 <- strassenInv(R5)<br /> C12 <- strassen(R3 , R6)<br /> C21 <- strassen(R6 , R2)<br /> R7 <- strassen(R3 , C21)<br /> C11 <- R1 - R7<br /> C22 <- -R6<br /> <br /> C <- rbind(cbind(C11,C12), cbind(C21,C22))<br /><br /> return(C)<br />}</pre><br /></div></div></div><br /><br /> <div style="margin-bottom: 2px;">Function <code style="color: rgb(0, 0, 0);">strassenInv3(A)</code><br /> <div style="margin-top: 5px; text-align: center;"><input value="Show" style="margin-top: 5px; width: 60px; font-size: 10px;" onclick="if (this.parentNode.parentNode.getElementsByTagName('div')[1].getElementsByTagName('div')[0].style.display != '') { this.parentNode.parentNode.getElementsByTagName('div')[1].getElementsByTagName('div')[0].style.display = ''; this.innerText = ''; this.value = 'Hide'; } else { this.parentNode.parentNode.getElementsByTagName('div')[1].getElementsByTagName('div')[0].style.display = 'none'; this.innerText = ''; this.value = 'Show'; }" type="button"> </div><br /> <div style="border: 1px inset ; margi : 0px; padding: 6px;"><div style="display: none;"><br /><pre><br />strassenInv3 <- function(A){<br /><br /> div4 <- function(A, r){<br /> A <- list(A)<br /> A11 <- A[[1]][1:(r/2),1:(r/2)]<br /> A12 <- A[[1]][1:(r/2),(r/2+1):r]<br /> A21 <- A[[1]][(r/2+1):r,1:(r/2)]<br /> A22 <- A[[1]][(r/2+1):r,(r/2+1):r]<br /> A <- list(X11=A11, X12=A12, X21=A21, X22=A22)<br /> return(A)<br /> }<br /><br /> strassen <- function(A, B){<br /> A <- div4(A, dim(A)[1])<br /> B <- div4(B, dim(B)[1])<br /> M1 <- (A$X11+A$X22) %*% (B$X11+B$X22)<br /> M2 <- (A$X21+A$X22) %*% B$X11<br /> M3 <- A$X11 %*% (B$X12-B$X22)<br /> M4 <- A$X22 %*% (B$X21-B$X11)<br /> M5 <- (A$X11+A$X12) %*% 
B$X22<br /> M6 <- (A$X21-A$X11) %*% (B$X11+B$X12)<br /> M7 <- (A$X12-A$X22) %*% (B$X21+B$X22)<br /><br /> C11 <- M1+M4-M5+M7<br /> C12 <- M3+M5<br /> C21 <- M2+M4<br /> C22 <- M1-M2+M3+M6<br /> <br /> C <- rbind(cbind(C11,C12), cbind(C21,C22))<br /> return(C)<br /> }<br /><br /> strassen2 <- function(A, B){<br /> A <- div4(A, dim(A)[1])<br /> B <- div4(B, dim(B)[1])<br /> M1 <- strassen((A$X11+A$X22) , (B$X11+B$X22))<br /> M2 <- strassen((A$X21+A$X22) , B$X11)<br /> M3 <- strassen(A$X11 , (B$X12-B$X22))<br /> M4 <- strassen(A$X22 , (B$X21-B$X11))<br /> M5 <- strassen((A$X11+A$X12) , B$X22)<br /> M6 <- strassen((A$X21-A$X11) , (B$X11+B$X12))<br /> M7 <- strassen((A$X12-A$X22) , (B$X21+B$X22))<br /><br /> C11 <- M1+M4-M5+M7<br /> C12 <- M3+M5<br /> C21 <- M2+M4<br /> C22 <- M1-M2+M3+M6<br /><br /> C <- rbind(cbind(C11,C12), cbind(C21,C22))<br /> return(C)<br /> }<br /><br /> if (nrow(A) != ncol(A)) <br /> { stop("only square matrices can be inverted") }<br /><br /> is.wholenumber <-<br /> function(x, tol = .Machine$double.eps^0.5) abs(x - round(x)) < tol<br /><br /> if ( (is.wholenumber(log(nrow(A), 2)) != TRUE) || (is.wholenumber(log(ncol(A), 2)) != TRUE) )<br /> { stop("only square matrices of dimension 2^k * 2^k can be inverted with Strassen method") }<br /><br /> A <- div4(A, dim(A)[1])<br /><br /> R1 <- strassenInv2(A$X11)<br /> R2 <- strassen2(A$X21 , R1)<br /> R3 <- strassen2(R1 , A$X12)<br /> R4 <- strassen2(A$X21 , R3)<br /> R5 <- R4 - A$X22<br /> R6 <- strassenInv2(R5)<br /> C12 <- strassen2(R3 , R6)<br /> C21 <- strassen2(R6 , R2)<br /> R7 <- strassen2(R3 , C21)<br /> C11 <- R1 - R7<br /> C22 <- -R6<br /> <br /> C <- rbind(cbind(C11,C12), cbind(C21,C22))<br /><br /> return(C)<br />}</pre><br /></div></div></div><br /><br />We now run some tests. 
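<br /><br />A quick way to judge any candidate inverse, whichever function produced it, is to multiply it back: the product of A and the candidate should be close to the identity matrix. The following is a minimal sketch of such a check; <code>check_inverse</code>, its tolerance and the small test matrix are hypothetical choices, not part of the original code.

```r
# Residual-based check of a candidate inverse.
# check_inverse() is a hypothetical helper; tol is an arbitrary choice.
check_inverse <- function(A, Ainv, tol = 1e-8) {
  residual <- A %*% Ainv - diag(nrow(A))   # should be (almost) all zeros
  max(abs(residual)) < tol
}

# small, well-conditioned example matrix: 5 * I + matrix of ones
A <- 5 * diag(4) + matrix(1, 4, 4)

check_inverse(A, solve(A))   # a correct inverse passes
check_inverse(A, diag(4))    # a wrong candidate fails
```

The same helper could be applied to the output of any of the three functions above on a matrix of dimension 2^k * 2^k.<br /><br />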
First, check whether each function inverts the matrix correctly by comparing its result with that of the standard R function <code style="color: rgb(0, 0, 0);">solve()</code>:<br /><br /><pre><br />A <- matrix(trunc(rnorm(512*512)*100), 512,512)<br /><br />all( round(solve(A),8) == round(strassenInv(A),8) )<br />[1] TRUE<br /><br />all( round(solve(A),8) == round(strassenInv2(A),8) )<br />[1] TRUE<br /><br />all( round(solve(A),6) == round(strassenInv3(A),6) )<br />[1] TRUE<br /></pre><br /><br />The functions perform the inversion correctly, but with limited precision: the first two agree with <code style="color: rgb(0, 0, 0);">solve()</code> to the eighth decimal place, while the third agrees only to the sixth. This is not a flaw in the algorithm itself. R stores numbers as binary floating-point values (64-bit doubles), and the rounding errors accumulate over the additional additions and multiplications of the Strassen steps.<br /><br />Now we analyze the computation time. The table below shows the results, obtained by running the following code:<br /><br /> <div style="margin-bottom: 2px;">Computation time<br /> <div style="margin-top: 5px; text-align: center;"><input value="Show" style="margin-top: 5px; width: 60px; font-size: 10px;" onclick="if (this.parentNode.parentNode.getElementsByTagName('div')[1].getElementsByTagName('div')[0].style.display != '') { this.parentNode.parentNode.getElementsByTagName('div')[1].getElementsByTagName('div')[0].style.display = ''; this.innerText = ''; this.value = 'Hide'; } else { this.parentNode.parentNode.getElementsByTagName('div')[1].getElementsByTagName('div')[0].style.display = 'none'; this.innerText = ''; this.value = 'Show'; }" type="button"> </div><br /> <div style="border: 1px inset ; margi : 0px; padding: 6px;"><div style="display: none;"><br /><pre><br />A <- matrix(trunc(rnorm(512*512)*100), 512,512)<br />system.time(solve(A))<br />system.time(strassenInv(A))<br />system.time(strassenInv2(A))<br />system.time(strassenInv3(A))<br /><br />A <- matrix(trunc(rnorm(1024*1024)*100), 1024,1024)<br />system.time(solve(A))<br 
/>system.time(strassenInv(A))<br />system.time(strassenInv2(A))<br />system.time(strassenInv3(A))<br /><br />A <- matrix(trunc(rnorm(2048*2048)*100), 2048,2048)<br />system.time(solve(A))<br />system.time(strassenInv(A))<br />system.time(strassenInv2(A))<br />system.time(strassenInv3(A))<br /><br />A <- matrix(trunc(rnorm(4096*4096)*100), 4096,4096)<br />system.time(solve(A))<br />system.time(strassenInv(A))<br />system.time(strassenInv2(A))<br />system.time(strassenInv3(A))</pre><br /></div></div></div><br /><br /><center><a target='_blank' href='http://img148.imageshack.us/img148/2208/tabella.jpg'><img src='http://img148.imageshack.us/img148/2208/tabella.th.jpg' border='0'/></a></center><br /><br />The results are clear: using a modification of the Strassen algorithm for matrix inversion gives a real time saving.<br /><br />Please remember the two recommendations I already made:<br />- The code can still be improved; if anyone wants to help me, I will be happy to update it<br />- If you find these functions useful for your work, a citation is always welcome (contact me by e-mail for details)</span>Todos Logoshttp://www.blogger.com/profile/09881188152777475558noreply@blogger.com2tag:blogger.com,1999:blog-4274823366855967619.post-88322992798997766552010-10-18T15:31:00.005+02:002010-10-18T22:09:24.626+02:00Fast matrix multiplication in R: Strassen's algorithmI tried to implement Strassen's algorithm for the multiplication of large matrices in R.<br /><br /><span class="fullpost">Here I present a PDF with some elements of theory, some examples and a possible solution in R.<br />I'm not a programmer, so the function is not optimized, but it works.<br /><br />I want to thank <a href="http://stackoverflow.com/users/382907/g-grothendieck" target="_blank">G. 
Grothendieck</a>, who suggested on <a href="http://stackoverflow.com/questions/3948497/create-a-bigger-matrix-from-smaller-one/3948621#3948621" target="_blank">Stack Overflow</a> a very nice way to create a bigger square matrix starting from a smaller one.<br /><br />This is just a first version of the function, and it needs more work. If someone wants to collaborate, I'll be very happy.<br />Finally, if you find my code useful for your work, I'd love to be cited (ask me via e-mail how to cite me: <span style="font-weight:bold;">todoslogos <span style="font-style:italic;">-at-</span> gmail . com</span>).<br /><br /><iframe width=100% height=560px frameborder=0 src=https://docs.google.com/gview?a=v&pid=explorer&chrome=false&api=true&embedded=true&srcid=0B5Soe5lALqepMmJjY2JiYjQtOTlhZi00ZTc2LThjODEtNzc0OTg0MjA0YzA3&hl=en></iframe><br /><br /></span>Todos Logoshttp://www.blogger.com/profile/09881188152777475558noreply@blogger.com4tag:blogger.com,1999:blog-4274823366855967619.post-23612460158245660132010-10-06T12:47:00.003+02:002010-10-06T12:51:01.200+02:00Convert decimal to IEEE-754 in RFor some theory on the IEEE-754 standard, you can read the <a href="http://en.wikipedia.org/wiki/IEEE_754-2008" target="_blank">Wikipedia</a> page. 
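<br /><br />Base R can already show the single-precision bit pattern of a number, through <code>writeBin()</code> and <code>rawToBits()</code>, which is handy for checking a hand-written conversion such as the one built in this post. The sketch below is not the method developed here; <code>float_bits</code> is a hypothetical helper name. Note that <code>writeBin()</code> rounds to the nearest single-precision value, so its last bit may differ from that of a conversion that simply truncates.

```r
# Single-precision (32-bit) bit pattern of a number, using base R only.
# float_bits() is a hypothetical helper name, not part of the post.
float_bits <- function(x) {
  # size = 4 stores the value as a 4-byte float; big-endian puts the sign byte first
  bytes <- writeBin(as.numeric(x), raw(), size = 4, endian = "big")
  # rawToBits() returns 8 bits per byte, least significant bit first
  bits <- matrix(as.integer(rawToBits(bytes)), nrow = 8)
  # reverse the rows so that each byte reads most significant bit first
  paste(bits[8:1, ], collapse = "")
}

float_bits(11.625)   # "01000001001110100000000000000000"
```

The string is laid out as 1 sign bit, 8 exponent bits and 23 mantissa bits.<br /><br />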
Here I will post only the code of the functions that perform the conversion in R.<br /><br /><span class="fullpost"><br />First we write some functions to convert decimal numbers to binary:<br /><br /><code style="color: rgb(153, 0, 0);"><br />decInt_to_8bit <- function(x, precs) {<br /># integer part: repeated division by 2, remainders read in reverse order<br />q <- c()<br />r <- c()<br />xx <- c()<br />for(i in 1:precs){<br />xx[1] <- x<br />q[i] <- xx[i] %/% 2<br />r[i] <- xx[i] %% 2<br />xx[i+1] <- q[i]<br />}<br />rr <- rev(r)<br />return(rr)<br />}<br /><br />devDec_to_8bit <- function(x, precs) {<br /># fractional part: repeated multiplication by 2, keeping the integer digit each time<br />nas <- c()<br />nbs <- c()<br />xxs <- c()<br />for(i in 1:precs)<br />{<br />xxs[1] <- x*2<br />nas[i] <- (xxs[i]) - floor(xxs[i])<br />nbs[i] <- trunc(xxs[i], 1)<br />xxs[i+1] <- nas[i]*2<br />}<br />return(nbs)<br />}<br /></code><br /><br />For example, with 8 bits:<br /><br /><code style="color: rgb(153, 0, 0);"><br />decInt_to_8bit(11, 8)<br />[1] 0 0 0 0 1 0 1 1<br /></code><br /><br /><br /><code style="color: rgb(153, 0, 0);"><br />devDec_to_8bit(0.625, 8)<br />[1] 1 0 1 0 0 0 0 0<br /></code><br /><br /><br /><code style="color: rgb(153, 0, 0);"><br />devDec_to_8bit(0.3, 8)<br />[1] 0 1 0 0 1 1 0 0<br />devDec_to_8bit(0.3, 16)<br />[1] 0 1 0 0 1 1 0 0 1 1 0 0 1 1 0 0<br /></code><br /><br />We can remove the extra zeros from the vectors (trailing zeros of the fractional part, leading zeros of the integer part) using these functions:<br /><br /><code style="color: rgb(153, 0, 0);"><br />remove.zero.aft <- function(a) {<br />n <- length(a)<br />for(i in n:1){<br />if (a[n]==0) a <- a[-n]<br />else return(a)<br />n <- n-1<br />}<br />}<br /><br />remove.zero.bef <- function(a) {<br />n <- length(a)<br />for(i in 1:n){<br />if (a[1]==0) a <- a[-1]<br />else return(a)<br />}<br />}<br /></code><br /><br />So we have:<br /><br /><code style="color: rgb(153, 0, 0);"><br />remove.zero.bef(decInt_to_8bit(11, 8))<br />[1] 1 0 1 1<br /><br />remove.zero.aft(devDec_to_8bit(0.625, 8))<br />[1] 1 0 1<br /></code><br /><br />Combining these functions, we have:<br /><br /><code style="color: rgb(153, 0, 0);"><br 
/>dec.to.nbit <- function(x,n) {<br />aa <- abs(trunc(x, 1))<br />bb <- abs(x) - abs(trunc(x))<br /><br />q <- c()<br />r <- c()<br />xx <- c()<br />for(i in 1:n){<br />xx[1] <- aa<br />q[i] <- xx[i] %/% 2<br />r[i] <- xx[i] %% 2<br />xx[i+1] <- q[i]<br />}<br />rr <- rev(r)<br /><br />nas <- c()<br />nbs <- c()<br />xxs <- c()<br />for(i in 1:n)<br />{<br />xxs[1] <- bb*2<br />nas[i] <- (xxs[i]) - floor(xxs[i])<br />nbs[i] <- trunc(xxs[i], 1)<br />xxs[i+1] <- nas[i]*2<br />}<br /><br />bef <- paste(remove.zero.bef(rr), collapse="")<br />aft <- paste(remove.zero.aft(nbs), collapse="")<br />bef.aft <- c(bef, aft)<br />strings <- paste(bef.aft, collapse=".")<br />return(strings)<br />}<br /></code><br /><br />Example:<br /><br /><code style="color: rgb(153, 0, 0);"><br />dec.to.nbit(11.625,8)<br />[1] "1011.101"<br /></code><br /><br /><center><hr width="50%"></center><br /><br />Now we can write the code for the decimal to IEEE-754 single-precision conversion in R:<br /><br /><code style="color: rgb(153, 0, 0);"><br />dec.to.ieee754 <- function(x) {<br /># note: valid for abs(x) >= 1; smaller values are not normalized correctly<br />aa <- abs(trunc(x, 1))<br />bb <- abs(x) - abs(trunc(x))<br /><br />rr <- decInt_to_8bit(aa, 32)<br /><br />ppc <- 24 - length(remove.zero.bef(rr))<br /><br />nbs <- devDec_to_8bit(bb, ppc)<br /><br />bef <- remove.zero.bef(rr)<br />aft <- remove.zero.aft(nbs)<br /><br />exp <- length(bef) - 1<br />mantissa <- c(bef[-1], aft)<br /><br /># the exponent field is always 8 bits wide, so its leading zeros must be kept<br />exp.bin <- decInt_to_8bit(exp + 127, 8)<br /><br />first <- c()<br />if (sign(x)==1) first=c(0)<br />if (sign(x)==-1) first=c(1)<br /><br />ieee754 <- c(first, exp.bin, mantissa, rep(0, 23-length(mantissa)))<br />ieee754 <- paste(ieee754, collapse="")<br /><br />return(ieee754)<br />}<br /></code><br /><br />The numbers 11.625 and 11.33 in IEEE-754 single precision are:<br /><br /><code style="color: rgb(153, 0, 0);"><br />dec.to.ieee754(11.625)<br />[1] "01000001001110100000000000000000"<br /><br />dec.to.ieee754(11.33)<br />[1] 
"01000001001101010100011110101110"<br /></code><br /><br />You can verify the output with this <a href="http://www.binaryconvert.com/index.html" target="_blank">Online Binary-Decimal Converter</a>.</span>Todos Logoshttp://www.blogger.com/profile/09881188152777475558noreply@blogger.com1tag:blogger.com,1999:blog-4274823366855967619.post-13426246491241632982010-04-28T18:22:00.004+02:002010-04-28T18:29:42.811+02:00Bhapkar V testThis is the code to perform the Bhapkar V test. I wrote it quickly, in about 2 hours, so the code is quite <i>brutal</i> and could be done better. I will correct it as soon as possible.<br /><br />WARNING: for now it works *ONLY* with 3 groups!<br /><br /><pre name="code" class="java"><br />bhapkar.test.3g <- function(data1){<br /><br /># data1: a list of 3 numeric vectors, one per group<br />sample <- c()<br />for(i in 1:length(data1)){<br />sample <- c(sample, rep(i, length(data1[[i]])))<br />}<br /><br />obs <- c()<br />for(i in 1:length(data1)){<br />obs <- c(obs, data1[[i]])<br />}<br />rank <- rank(obs)<br /><br /># for each observation of group 1, count the triplets in which it is the smallest<br />cplets <- list()<br />vec <- c()<br />for(i in 1:length(data1[[1]])){<br />vec <- c(vec, (length(data1[[2]][data1[[2]]>data1[[1]][i]]) * length(data1[[3]][data1[[3]]>data1[[1]][i]])))<br />}<br />cplets[[1]] <- vec<br /><br /># same count for group 2<br />vec <- c()<br />for(i in 1:length(data1[[2]])){<br />vec <- c(vec, (length(data1[[1]][data1[[1]]>data1[[2]][i]]) * length(data1[[3]][data1[[3]]>data1[[2]][i]])))<br />}<br />cplets[[2]] <- vec<br /><br /># same count for group 3<br />vec <- c()<br />for(i in 1:length(data1[[3]])){<br />vec <- c(vec, (length(data1[[2]][data1[[2]]>data1[[3]][i]]) * length(data1[[1]][data1[[1]]>data1[[3]][i]])))<br />}<br />cplets[[3]] <- vec<br /><br />cplets1 <- c(cplets[[1]], cplets[[2]], cplets[[3]])<br />mydata <- data.frame(obs=obs, sample=sample, rank=rank, cplets=cplets1)<br /><br />v1 <- sum(cplets[[1]])<br />v2 <- sum(cplets[[2]])<br />v3 <- sum(cplets[[3]])<br /><br />vtot <- v1+v2+v3<br />u1 <- v1/vtot<br />u2 <- v2/vtot<br />u3 <- v3/vtot<br />u <- c(u1,u2,u3)<br /><br />lengths <- 
c(length(data1[[1]]), length(data1[[2]]), length(data1[[3]]))<br />N <- sum(lengths)<br />P <- c(lengths / N)<br />ngroup <- length(data1)<br /><br />V <- N * (2*length(data1)-1)* (sum(P*((u-1/ngroup)^2)) - (sum(P*((u-1/ngroup))))^2)<br /><br />## upper tail of the chi-square distribution: large values of V speak against H0<br />prop <- pchisq(V, df=length(data1)-1, lower.tail=FALSE)<br />names(V) = "V = "<br />method = "Bhapkar V-test"<br />rval <- list(method = method, statistic = V, p.value = prop)<br />class(rval) = "htest"<br />return(rval)<br /><br /><br /><br />}</pre><br /><br />An example:<br /><br /><pre name="code" class="java"><br />a <- c(42, 46, 48.5, 49, 68, 51)<br />b <- c(70.5, 54, 60,72)<br />c <- c(66, 54, 43, 105, 94)<br /><br />mydata <- list(a,b,c)<br /><br />bhapkar.test.3g(mydata)<br /><br /><br /> Bhapkar V-test<br /><br />data: <br />V = 6.7713, p-value = 0.03386<br /><br /></pre><br /><br /><b>REFERENCES</b>:<br /><i>Statistical analysis of nonnormal data</i><br />By J. V. Deshpande, A. P. Gore, A. Shanubhogue<br />p. 61Todos Logoshttp://www.blogger.com/profile/09881188152777475558noreply@blogger.com1tag:blogger.com,1999:blog-4274823366855967619.post-40694831120319827172010-01-06T13:21:00.003+01:002010-01-06T13:29:12.803+01:00Latin squares design in R<br /><br />The Latin square design is used where the researcher desires to control the variation in an experiment that is related to rows and columns in the field.<br />Remember that:<br /> * Treatments are assigned at random within rows and columns, with each treatment once per row and once per column.<br /> * There are equal numbers of rows, columns, and treatments.<br /> * Useful where the experimenter desires to control variation in two different directions<br /><br />The formulas used for this kind of three-way ANOVA are:<br /><br /><table border=1 cellspacing="0" cellpadding="5"><tr align="right" valign="top" bgcolor="cccccc"> <td align="left"><b>Source of<br>variation</b></td> <td align="center"><b>Degrees of<br>freedom<sup>a</sup></b></td> <td align="center"><b>Sums of<br>squares (SSQ)</b></td> 
<td align="center"><b>Mean<br>square (MS)</b></td> <td align="center"><b>F</b></td></tr><tr align="right" valign="center"> <td align="left">Rows (<i>R</i>)</td> <td align="center">r-1</td> <td align="center">SSQ<sub><i>R</i></sub></td> <td align="center">SSQ<sub><i>R</i></sub>/(r-1)</td> <td align="center">MS<sub><i>R</i></sub>/MS<sub><i>E</i></sub></td></tr><tr align="right" valign="center"> <td align="left">Columns (<i>C</i>)</td> <td align="center">r-1</td> <td align="center">SSQ<sub><i>C</i></sub></td> <td align="center">SSQ<sub><i>C</i></sub>/(r-1)</td> <td align="center">MS<sub><i>C</i></sub>/MS<sub><i>E</i></sub></td></tr><tr align="right" valign="center"> <td align="left">Treatments (<i>Tr</i>)</td> <td align="center">r-1</td> <td align="center">SSQ<sub><i>Tr</i></sub></td> <td align="center">SSQ<sub><i>Tr</i></sub>/(r-1)</td> <td align="center">MS<sub><i>Tr</i></sub>/MS<sub><i>E</i></sub></td></tr><tr align="right" valign="center"> <td align="left">Error (<i>E</i>)</td> <td align="center">(r-1)(r-2)</td> <td align="center">SSQ<sub><i>E</i></sub></td> <td align="center">SSQ<sub><i>E</i></sub>/((r-1)(r-2))</td> <td align="center"> </td></tr><tr align="right" valign="center"> <td align="left">Total (<i>Tot</i>)</td> <td align="center">r<sup>2</sup>-1</td> <td align="center">SSQ<sub><i>Tot</i></sub></td> <td align="center"> </td> <td align="center"> </td></tr><tr><td colspan=5><sup>a</sup>where r = number of (treatments=rows=columns).</td></tr></table><br /><br /><br /><span class="fullpost">Suppose you want to analyse the productivity of 5 kinds of fertilizer, 5 kinds of tillage, and 5 kinds of seed. 
The data are organized in a Latin square design, as follows:<br /><br /><pre><br /> treatA treatB treatC treatD treatE<br />fertilizer1 "A42" "C47" "B55" "D51" "E44" <br />fertilizer2 "E45" "B54" "C52" "A44" "D50" <br />fertilizer3 "C41" "A46" "D57" "E47" "B48" <br />fertilizer4 "B56" "D52" "E49" "C50" "A43" <br />fertilizer5 "D47" "E49" "A45" "B54" "C46" </pre><br /><br />The three factors are: fertilizer (fertilizer1:5), tillage (treatA:E), seed (A:E). The numbers are the productivity in cwt / year.<br /><br />Now create a dataframe in R with these data:<br /><br /><pre name="code" class="java"><br />fertil <- c(rep("fertil1",1), rep("fertil2",1), rep("fertil3",1), rep("fertil4",1), rep("fertil5",1))<br />treat <- c(rep("treatA",5), rep("treatB",5), rep("treatC",5), rep("treatD",5), rep("treatE",5))<br />seed <- c("A","E","C","B","D", "C","B","A","D","E", "B","C","D","E","A", "D","A","E","C","B", "E","D","B","A","C")<br />freq <- c(42,45,41,56,47, 47,54,46,52,49, 55,52,57,49,45, 51,44,47,50,54, 44,50,48,43,46)<br /> <br />## fertil has length 5 and is recycled to 25 rows<br />## stringsAsFactors=TRUE keeps the three factors as factors also in R >= 4.0<br />mydata <- data.frame(treat, fertil, seed, freq, stringsAsFactors=TRUE)<br /><br />mydata<br /><br /> treat fertil seed freq<br />1 treatA fertil1 A 42<br />2 treatA fertil2 E 45<br />3 treatA fertil3 C 41<br />4 treatA fertil4 B 56<br />5 treatA fertil5 D 47<br />6 treatB fertil1 C 47<br />7 treatB fertil2 B 54<br />8 treatB fertil3 A 46<br />9 treatB fertil4 D 52<br />10 treatB fertil5 E 49<br />11 treatC fertil1 B 55<br />12 treatC fertil2 C 52<br />13 treatC fertil3 D 57<br />14 treatC fertil4 E 49<br />15 treatC fertil5 A 45<br />16 treatD fertil1 D 51<br />17 treatD fertil2 A 44<br />18 treatD fertil3 E 47<br />19 treatD fertil4 C 50<br />20 treatD fertil5 B 54<br />21 treatE fertil1 E 44<br />22 treatE fertil2 D 50<br />23 treatE fertil3 B 48<br />24 treatE fertil4 A 43<br />25 treatE fertil5 C 46</pre><br /><br />We can re-create the original table, using the matrix function:<br /><br /><pre name="code" class="java"><br />matrix(mydata$seed, 5,5)<br /><br /> 
[,1] [,2] [,3] [,4] [,5]<br />[1,] "A" "C" "B" "D" "E" <br />[2,] "E" "B" "C" "A" "D" <br />[3,] "C" "A" "D" "E" "B" <br />[4,] "B" "D" "E" "C" "A" <br />[5,] "D" "E" "A" "B" "C" <br /><br />matrix(mydata$freq, 5,5)<br /><br /> [,1] [,2] [,3] [,4] [,5]<br />[1,] 42 47 55 51 44<br />[2,] 45 54 52 44 50<br />[3,] 41 46 57 47 48<br />[4,] 56 52 49 50 43<br />[5,] 47 49 45 54 46</pre><br /><br />Before proceeding with the analysis of variance of this Latin square design, it is useful to draw boxplots, to get an idea of what to expect:<br /><br /><pre name="code" class="java"><br />par(mfrow=c(2,2))<br />plot(freq ~ fertil+treat+seed, mydata)</pre><br /><br /><center><a href="http://img684.imageshack.us/img684/1954/lsd1.jpg" target="_blank"><img src="http://img684.imageshack.us/img684/1954/lsd1.th.jpg" border="0" /></a></center><br /><br />Note that the differences between fertilizers are small, those between tillage types are moderate, and those between seeds are very large.<br />Now we confirm these graphical observations with the ANOVA table:<br /><br /><pre name="code" class="java"><br />myfit <- lm(freq ~ fertil+treat+seed, mydata)<br />anova(myfit)<br /><br />Analysis of Variance Table<br /><br />Response: freq<br /> Df Sum Sq Mean Sq F value Pr(>F) <br />fertil 4 17.760 4.440 0.7967 0.549839 <br />treat 4 109.360 27.340 4.9055 0.014105 * <br />seed 4 286.160 71.540 12.8361 0.000271 ***<br />Residuals 12 66.880 5.573 <br />---<br />Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 </pre><br /><br />Well, the boxplot was useful. 
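<br /><br />The sums of squares in the ANOVA table can also be checked by hand. What follows is only a minimal sketch, assuming a balanced Latin square in which the sum of squares of a factor is the number of observations per level times the squared deviations of the level means from the grand mean; the helper <code style="color: rgb(153, 0, 0);">ssq</code> and the names <code style="color: rgb(153, 0, 0);">ssq.*</code>, <code style="color: rgb(153, 0, 0);">ms.err</code> and <code style="color: rgb(153, 0, 0);">F.seed</code> are hypothetical, introduced here for illustration:<br /><br /><pre name="code" class="java">
## same data as in the dataframe mydata
fertil <- rep(paste0("fertil", 1:5), times = 5)
treat  <- rep(paste0("treat", LETTERS[1:5]), each = 5)
seed   <- c("A","E","C","B","D", "C","B","A","D","E", "B","C","D","E","A",
            "D","A","E","C","B", "E","D","B","A","C")
freq   <- c(42,45,41,56,47, 47,54,46,52,49, 55,52,57,49,45,
            51,44,47,50,54, 44,50,48,43,46)

## ssq(): hypothetical helper, sum of squares of one factor
ssq <- function(y, g) {
  level.means <- tapply(y, g, mean)     # mean of each level
  n.per.level <- tapply(y, g, length)   # observations per level
  sum(n.per.level * (level.means - mean(y))^2)
}

ssq.fertil <- ssq(freq, fertil)               # 17.76
ssq.treat  <- ssq(freq, treat)                # 109.36
ssq.seed   <- ssq(freq, seed)                 # 286.16
ssq.tot    <- sum((freq - mean(freq))^2)      # 480.16
ssq.err    <- ssq.tot - ssq.fertil - ssq.treat - ssq.seed   # 66.88

## mean square of the error and F for the seed, as in the table of formulas
r <- 5
ms.err <- ssq.err / ((r - 1) * (r - 2))
F.seed <- (ssq.seed / (r - 1)) / ms.err
</pre><br /><br />The values agree with the column Sum Sq of the ANOVA table, and <code style="color: rgb(153, 0, 0);">F.seed</code> with the F value reported for the seed.<br /><br />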
Look at the significance of the F-test.<br />- The differences between groups for the fertilizer are not significant (p-value > 0.1);<br />- The differences between groups for the tillage are significant (p-value < 0.05);<br />- The differences between groups for the seed are highly significant (p-value < 0.001);</span>Todos Logoshttp://www.blogger.com/profile/09881188152777475558noreply@blogger.com8tag:blogger.com,1999:blog-4274823366855967619.post-29076458591623439072009-09-05T20:26:00.000+02:002009-09-05T20:27:46.892+02:00Polynomial regression techniques<br /><br />Suppose we want to find a polynomial that approximates as well as possible the following dataset on the population of a certain Italian city from 1959 to 1969. The table summarizes the data:<br /><br />$$\begin{tabular}{|l|l|}\hline Year & Population\\ \hline 1959&4835\\ 1960&4970\\ 1961&5085\\ 1962&5160\\ 1963&5310\\ 1964&5260\\ 1965&5235\\ 1966&5255\\ 1967&5235\\ 1968&5210\\ 1969&5175\\ \hline \end{tabular}$$<br /><br /><span class="fullpost">First we import the data into R:<br /><br /><pre name="code" class="java"><br />Year <- c(1959, 1960, 1961, 1962, 1963, 1964, 1965, 1966, 1967, 1968, 1969)<br />Population <- c(4835, 4970, 5085, 5160, 5310, 5260, 5235, 5255, 5235, 5210, 5175)</pre><br /><br />Now we create the dataframe named sample1:<br /><br /><pre name="code" class="java"><br />sample1 <- data.frame(Year, Population)<br />sample1<br /><br /> Year Population<br />1 1959 4835<br />2 1960 4970<br />3 1961 5085<br />4 1962 5160<br />5 1963 5310<br />6 1964 5260<br />7 1965 5235<br />8 1966 5255<br />9 1967 5235<br />10 1968 5210<br />11 1969 5175</pre><br /><br />At this point it may be useful to plot these values, to observe the trend and get an idea of the final polynomial function. 
For convenience we center the column <code style="color: rgb(153, 0, 0);">Year</code> around zero, thus:<br /><br /><pre name="code" class="java"><br />sample1$Year <- sample1$Year - 1964<br />sample1<br /><br /> Year Population<br />1 -5 4835<br />2 -4 4970<br />3 -3 5085<br />4 -2 5160<br />5 -1 5310<br />6 0 5260<br />7 1 5235<br />8 2 5255<br />9 3 5235<br />10 4 5210<br />11 5 5175</pre><br /><br />Now plot the values:<br /><br /><pre name="code" class="java"><br />plot(sample1$Year, sample1$Population, type="b")</pre><br /><br /><center><a href="http://img137.imageshack.us/img137/7946/pol1.jpg" target="_blank"><img src="http://img137.imageshack.us/img137/7946/pol1.th.jpg" border="0"/></a></center><br /><br />At this point we can start the search for a polynomial model that adequately approximates our data. First, we specify that we want a polynomial function of X, i.e. a <em>raw polynomial</em>, as opposed to an <em>orthogonal polynomial</em>. This is an important distinction, because the commands and the results in R differ in the two cases. So we want a function of X like:<br /><br />$$f(x)=\beta_0+\beta_1x+\beta_2x^2+\beta_3x^3+ ... +\beta_nx^n$$<br /><br />At what degree of the polynomial should we stop? It depends on the degree of precision that we seek. The higher the degree of the polynomial, the closer the model follows the data, but the harder the calculation; we must also verify the significance of the coefficients that are found. But let's get straight to the point.<br /><br />In R there are two equivalent methods for <b>fitting a polynomial regression model</b> (not orthogonal). 
Suppose we seek the values of the beta coefficients for a polynomial of 1st degree, then of 2nd degree, and of 3rd degree:<br /><br /><pre name="code" class="java"><br />fit1 <- lm(sample1$Population ~ sample1$Year)<br />fit2 <- lm(sample1$Population ~ sample1$Year + I(sample1$Year^2))<br />fit3 <- lm(sample1$Population ~ sample1$Year + I(sample1$Year^2) + I(sample1$Year^3))</pre><br /><br />Or we can write more quickly, for polynomials of degree 2 and 3:<br /><br /><pre name="code" class="java"><br />fit2b <- lm(sample1$Population ~ poly(sample1$Year, 2, raw=TRUE))<br />fit3b <- lm(sample1$Population ~ poly(sample1$Year, 3, raw=TRUE))</pre><br /><br />The function <code style="color: rgb(153, 0, 0);">poly</code> is useful if you want a polynomial of high degree, because it avoids writing the formula out explicitly. If we specify <code style="color: rgb(153, 0, 0);">raw=TRUE</code>, the two methods provide the same output, but if we do not specify <code style="color: rgb(153, 0, 0);">raw=TRUE</code> (or specify <code style="color: rgb(153, 0, 0);">raw=F</code>), the function <code style="color: rgb(153, 0, 0);">poly</code> gives us the values of the beta parameters of an orthogonal polynomial, which differ from those of the general formula I wrote above, although both models are valid.<br /><br />Let's look at the output. <br /><br /><pre name="code" class="java"><br />summary(fit2)<br />## or summary(fit2b)<br /><br />Call:<br />lm(formula = sample1$Population ~ sample1$Year + I(sample1$Year^2))<br /><br />Residuals:<br /> Min 1Q Median 3Q Max <br />-46.888 -18.834 -3.159 2.040 86.748 <br /><br />Coefficients:<br /> Estimate Std. Error t value Pr(>|t|) <br />(Intercept) 5263.159 17.655 298.110 < 2e-16 ***<br />sample1$Year 29.318 3.696 7.933 4.64e-05 ***<br />I(sample1$Year^2) -10.589 1.323 -8.002 4.36e-05 ***<br />---<br />Signif. 
codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 <br /><br />Residual standard error: 38.76 on 8 degrees of freedom<br />Multiple R-squared: 0.9407, Adjusted R-squared: 0.9259 <br />F-statistic: 63.48 on 2 and 8 DF, p-value: 1.235e-05 </pre><br /><br />The output of <code style="color: rgb(153, 0, 0);">summary(fit2b)</code> is the same. We obtained the values of beta0 (5263.159), beta1 (29.318) and beta2 (-10.589), all three of which appear to be significant. The equation of the polynomial of degree 2 of our model is:<br /><br />$$f(x)=5263.159+29.318x-10.589x^2$$<br /><br />If we want a polynomial of 3rd degree, we have:<br /><br /><pre name="code" class="java"><br />summary(fit3)<br />## or summary(fit3b)<br /><br />Call:<br />lm(formula = sample1$Population ~ sample1$Year + I(sample1$Year^2) + <br /> I(sample1$Year^3))<br /><br />Residuals:<br /> Min 1Q Median 3Q Max <br />-32.774 -14.802 -1.253 3.199 72.634 <br /><br />Coefficients:<br /> Estimate Std. Error t value Pr(>|t|) <br />(Intercept) 5263.1585 15.0667 349.324 4.16e-16 ***<br />sample1$Year 14.3638 8.1282 1.767 0.1205 <br />I(sample1$Year^2) -10.5886 1.1293 -9.376 3.27e-05 ***<br />I(sample1$Year^3) 0.8401 0.4209 1.996 0.0861 . <br />---<br />Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 <br /><br />Residual standard error: 33.08 on 7 degrees of freedom<br />Multiple R-squared: 0.9622, Adjusted R-squared: 0.946 <br />F-statistic: 59.44 on 3 and 7 DF, p-value: 2.403e-05 </pre><br /><br />The equation is:<br /><br />$$f(x)=5263.1585+14.3638x-10.5886x^2+0.8401x^3$$<br /><br />In the latter case, however, the coefficients beta1 and beta3 are not significant at the 5% level, so the polynomial of 2nd degree is the better model. Furthermore, look at the Multiple R-squared: in the 2nd degree model it is 94.07%, while in the 3rd degree model it is 96.22%. It seems that there has been an increase in the accuracy of the model, but is it a significant increase? 
We can compare the two models with an ANOVA table:<br /><br /><pre name="code" class="java"><br />anova(fit2, fit3)<br /><br />Analysis of Variance Table<br /><br />Model 1: sample1$Population ~ sample1$Year + I(sample1$Year^2)<br />Model 2: sample1$Population ~ sample1$Year + I(sample1$Year^2) + I(sample1$Year^3)<br /> Res.Df RSS Df Sum of Sq F Pr(>F) <br />1 8 12019.8 <br />2 7 7659.5 1 4360.3 3.9848 0.0861 .<br />---<br />Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 </pre><br /><br />Since the p-value is greater than 0.05, we do not reject the null hypothesis: the 3rd degree term does not bring a significant improvement of the model.<br /><br />The biggest problem now is to represent the result graphically. In fact, in R there is not (as far as I know) a dedicated function for plotting the polynomials found. We must therefore resort to some graphical workarounds, still valid but somewhat laborious.<br /><br />First, we plot the values with the command seen before. This time we display only the lines and not the points, for graphical convenience:<br /><br /><pre name="code" class="java"><br />plot(sample1$Year, sample1$Population, type="l", lwd=3)</pre><br /><br />Now add to this chart the curve of the 2nd degree polynomial, in this way:<br /><br /><pre name="code" class="java"><br />points(sample1$Year, predict(fit2), type="l", col="red", lwd=2)</pre><br /><br />The function <code style="color: rgb(153, 0, 0);">predict()</code> computes the fitted Y values for the given X values. The resulting coordinates are then joined by line segments. What is plotted is therefore not the continuous function, but a discrete approximation of it. 
With only a few values, this method gives a rather coarse picture.<br /><br /><center><a href="http://img245.imageshack.us/img245/9274/pol2d.jpg" target="_blank"><img src="http://img245.imageshack.us/img245/9274/pol2d.th.jpg" border="0"/></a></center><br /><br />Let's add the graph of the polynomial of 3rd degree:<br /><br /><pre name="code" class="java"><br />points(sample1$Year, predict(fit3), type="l", col="blue", lwd=2)</pre><br /><br /><center><a href="http://img269.imageshack.us/img269/6797/pol3k.jpg" target="_blank"><img src="http://img269.imageshack.us/img269/6797/pol3k.th.jpg" border="0"/></a></center><br /><br />As you can see, the two models have very similar trends.<br /><br />If we want instead the graph of the continuous functions, we proceed in this manner. First we define the polynomial we previously found as an R function:<br /><br /><pre name="code" class="java"><br />pol2 <- function(x) fit2$coefficient[3]*x^2 + fit2$coefficient[2]*x + fit2$coefficient[1]</pre><br /><br />Remember that:<br />- coefficient[1] = beta0<br />- coefficient[2] = beta1<br />- coefficient[3] = beta2<br />and so on.<br /><br />At this point we plot the coordinates of sample1 and then overlay the curve with <code style="color: rgb(153, 0, 0);">curve()</code>, passing <code style="color: rgb(153, 0, 0);">add=TRUE</code> so that it is drawn on the existing plot:<br /><br /><pre name="code" class="java"><br />plot(sample1$Year, sample1$Population, type="p", lwd=3)<br />pol2 <- function(x) fit2$coefficient[3]*x^2 + fit2$coefficient[2]*x + fit2$coefficient[1]<br />curve(pol2, add=TRUE, col="red", lwd=2)</pre><br /><br />Without <code style="color: rgb(153, 0, 0);">add=TRUE</code>, <code style="color: rgb(153, 0, 0);">curve</code> starts a new plot and the points disappear; in that case we can draw them again with the command <code style="color: rgb(153, 0, 0);">points</code>:<br /><br /><pre name="code" class="java"><br />points(sample1$Year, sample1$Population, type="p", lwd=3)</pre><br /><br />A note: you must follow the order of commands as I have described (first <code style="color: rgb(153, 0, 0);">plot</code>, then <code style="color: rgb(153, 0, 0);">curve</code>), otherwise the function <code style="color: rgb(153, 0, 0);">curve</code> creates a wrong graph. 
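<br /><br />As an aside, the continuous curve can also be approximated without writing the polynomial by hand, by evaluating <code style="color: rgb(153, 0, 0);">predict()</code> on a fine grid of X values. This is only a minimal sketch: it assumes that the model is refitted with the <code style="color: rgb(153, 0, 0);">data=</code> argument, so that <code style="color: rgb(153, 0, 0);">predict()</code> can be given new values of <code style="color: rgb(153, 0, 0);">Year</code>; the names <code style="color: rgb(153, 0, 0);">fit2c</code> and <code style="color: rgb(153, 0, 0);">grid</code> are hypothetical, introduced here for illustration:<br /><br /><pre name="code" class="java">
## same data as above, with Year already centered on 1964
Year <- 1959:1969 - 1964
Population <- c(4835, 4970, 5085, 5160, 5310, 5260, 5235, 5255, 5235, 5210, 5175)
sample1 <- data.frame(Year, Population)

## refit the 2nd degree model using data=, so that predict() accepts newdata
fit2c <- lm(Population ~ poly(Year, 2, raw=TRUE), data=sample1)

## fine grid of X values over the observed range
grid <- data.frame(Year = seq(min(sample1$Year), max(sample1$Year), length.out=200))
grid$Population <- predict(fit2c, newdata=grid)

## experimental points, then the fitted curve on the same plot
plot(sample1$Year, sample1$Population, type="p", lwd=3)
lines(grid$Year, grid$Population, col="red", lwd=2)
</pre><br /><br />With 200 grid values the line segments are short enough to look like a smooth curve, and the same code works for any degree by changing the second argument of <code style="color: rgb(153, 0, 0);">poly</code>.<br /><br />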
Summarizing, the commands to get the continuous function and the experimental points on the same graph are the following (with <code style="color: rgb(153, 0, 0);">add=TRUE</code> the curve is drawn on top of the existing plot, so the points remain visible):<br /><br /><pre name="code" class="java"><br />plot(sample1$Year, sample1$Population, type="p", lwd=3)<br />pol2 <- function(x) fit2$coefficient[3]*x^2 + fit2$coefficient[2]*x + fit2$coefficient[1]<br />curve(pol2, add=TRUE, col="red", lwd=2)</pre><br /><br />The graph we get is the following:<br /><br /><center><a href="http://img245.imageshack.us/img245/4396/pol4.jpg" target="_blank"><img src="http://img245.imageshack.us/img245/4396/pol4.th.jpg" border="0"/></a></center><br /><br />Now draw the graph of the polynomial of 3rd degree:<br /><br /><pre name="code" class="java"><br />plot(sample1$Year, sample1$Population, type="p", lwd=3)<br />pol3 <- function(x) fit3$coefficient[4]*x^3 + fit3$coefficient[3]*x^2 + fit3$coefficient[2]*x + fit3$coefficient[1]<br />curve(pol3, add=TRUE, col="red", lwd=2)</pre><br /><br /><center><a href="http://img269.imageshack.us/img269/261/pol5e.jpg" target="_blank"><img src="http://img269.imageshack.us/img269/261/pol5e.th.jpg" border="0"/></a></center><br /></span>Todos Logoshttp://www.blogger.com/profile/09881188152777475558noreply@blogger.com8tag:blogger.com,1999:blog-4274823366855967619.post-23790979299193350202009-08-25T20:44:00.004+02:002009-08-27T17:18:06.304+02:00Web-site trend analysis with data from Google Analytics<br /><br />This post is a summary of my two previous posts on <b>trend analysis</b> with the <a href="http://statistic-on-air.blogspot.com/2009/08/trend-analysis-with-cox-stuart-test-in.html">Cox-Stuart test</a> and on <a href="http://statistic-on-air.blogspot.com/2009/08/simple-linear-regression.html">simple linear regression</a>. Our goal is to assess the trend in the number of visits received by a site over a long period. 
I use <em>Google Analytics</em>, because this tool allows us to save the various reports in <em>Excel CSV format</em>. Let's see, step by step, how to save the report, then <b>how to import data from Excel to R</b>, and finally how to estimate whether the number of daily visitors follows an increasing or decreasing trend.<br /><br />Let's start by creating an ad hoc report in Google Analytics. Once you have logged in, select the date range that you want to analyze. Then click on <i>Visits</i>.<br /><br /><span class="fullpost"><center><a target='_blank' href='http://img25.imageshack.us/img25/5784/62278291.jpg'><img src='http://img25.imageshack.us/img25/5784/62278291.th.jpg' border='0'/></a></center><br /><br />At this point we can save the report, clicking on <i>Export</i> and then clicking on <b>CSV for Excel</b>.<br /><br /><center><a target='_blank' href='http://img253.imageshack.us/img253/9418/38168132.jpg'><img src='http://img253.imageshack.us/img253/9418/38168132.th.jpg' border='0'/></a></center><br /><br />Save the CSV file, and open it with Excel. Here is how it looks:<br /><br /><center><a target='_blank' href='http://img443.imageshack.us/img443/1205/83377368.jpg'><img src='http://img443.imageshack.us/img443/1205/83377368.th.jpg' border='0'/></a></center><br /><br />Now we import the data into R. Importing data from Excel into R is very simple. 
Simply select the column (or columns) of interest (in our case the column <code style="color: rgb(153, 0, 0);">Visits</code>) and copy it to the clipboard with CTRL + C (remember to include the header cell <code style="color: rgb(153, 0, 0);">Visits</code>, because it will become the name of the column):<br /><br />Then open R and type the following command:<br /><br /><pre name="code" class="java"><br />myvisit <- read.delim("clipboard")<br /><br />myvisit<br /><br /> Visits<br />1 33<br />2 41<br />3 34<br />4 45<br />5 46<br />6 37<br />7 31<br />8 37<br />9 34<br />10 34<br />11 48<br />12 39<br />13 33<br />...</pre><br /><br /><br />It is a one-column dataframe; the name of the column is Visits (this is why it is important to select the header in Excel).<br /><br />Now we can proceed with the analysis of the trend in the two proposed ways: through a <b>Cox-Stuart test</b> and through the <b>analysis of the simple linear regression</b>.<br /><br />The function to perform the Cox-Stuart test is available <a href="http://statistic-on-air.blogspot.com/2009/08/trend-analysis-with-cox-stuart-test-in.html">here</a>. First we must convert the dataframe into a format that can be read by the function cox.stuart.test, like this:<br /><br /><pre name="code" class="java"><br />visits <- c(myvisit$Visits)</pre><br /><br />In this way I have created a vector (visits) that contains all the data of the column Visits of the dataframe myvisit, in the same order. Now we perform the Cox-Stuart test:<br /><br /><pre name="code" class="java"><br />cox.stuart.test(visits)<br /><br /> Cox-Stuart test for trend analysis<br /><br />data: <br />Increasing trend, p-value = 0.0012</pre><br /><br />The output is very clear: we have detected an increasing trend of visits, highly significant (since <i>p-value < 0.05</i>). <br /><br /><center><hr width=50%></center><br /><br />If we are not satisfied or sure of this result, we can take into account the slope of the regression line. First we may want to plot the data. 
The vector visits contains the daily visits to the site. Now we create an ordered vector of the days in question, of the same length as the vector visits:<br /><br /><pre name="code" class="java"><br />days <- c(1 : length(visits))</pre><br /><br />Create a plot:<br /><br /><pre name="code" class="java"><br />plot(days, visits, type="b")</pre><br /><br />Choosing <code style="color: rgb(153, 0, 0);">type="b"</code> we see both dots and lines, as shown in the figure:<br /><br /><center><a target='_blank' href='http://img145.imageshack.us/img145/2041/88334099.jpg'><img src='http://img145.imageshack.us/img145/2041/88334099.th.jpg' border='0'/></a></center><br /><br />From this plot it is not easy to see a possible trend in the visits. We can still do a regression analysis. Evaluating the sign of the slope of the line, we can estimate whether the trend is increasing or decreasing:<br /><br /><pre name="code" class="java"><br />fit <- lm(visits ~ days)<br />summary(fit)<br /><br />Call:<br />lm(formula = visits ~ days)<br /><br />Residuals:<br /> Min 1Q Median 3Q Max <br />-22.714 -6.197 -1.313 5.648 31.153 <br /><br />Coefficients:<br /> Estimate Std. Error t value Pr(>|t|) <br />(Intercept) 31.79694 2.27151 13.998 < 2e-16 ***<br />days 0.19815 0.04242 4.671 1.04e-05 ***<br />---<br />Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 <br /><br />Residual standard error: 10.81 on 90 degrees of freedom<br />Multiple R-squared: 0.1951, Adjusted R-squared: 0.1862 <br />F-statistic: 21.82 on 1 and 90 DF, p-value: 1.043e-05 </pre><br /><br />The slope coefficient has a value of: <code style="color: rgb(153, 0, 0);">b = 0.19815</code>. It has a positive sign, which suggests an increasing trend. The value of the t statistic for the slope, and its p-value, indicate that the slope is also significant. 
We can therefore say that there is an increasing trend.<br /><br />Finally, we can draw the regression line directly on the plot previously obtained, in this way:<br /><br /><pre name="code" class="java"><br />plot(days, visits, type="b")<br />abline(fit, col="red", lwd=3)</pre><br /><br />The command abline allows us to add the line defined by the fitted equation directly to the current chart; the parameter "col" specifies the color and the parameter "lwd" specifies the thickness of the line. Observe now the graph:<br /><br /><center><a target='_blank' href='http://img294.imageshack.us/img294/9043/36017503.jpg'><img src='http://img294.imageshack.us/img294/9043/36017503.th.jpg' border='0'/></a></center><br /><br />The line confirms the increasing trend already detected by the Cox-Stuart test.</span>Todos Logoshttp://www.blogger.com/profile/09881188152777475558noreply@blogger.com1tag:blogger.com,1999:blog-4274823366855967619.post-82378564033877925332009-08-20T12:43:00.009+02:002009-08-20T13:05:32.095+02:00Simple logistic regression on qualitative dichotomic variables<br /><br />In this post we will see briefly how to implement a logistic regression model when you have categorical (qualitative) variables, organized in two-way contingency tables. In this model the dependent variable (Y) and the independent variable (X) are both dichotomous (Bernoulli) variables.<br /><br />In general, the probability that Y = 1 as a function of the <b>predictors</b> is:<br /><br />$$P(Y=1|X=x)=\pi(x)=\frac{exp(\beta_0+\beta_1x_1+\cdots +\beta_kx_k)}{1+exp(\beta_0+\beta_1x_1+\cdots +\beta_kx_k)}$$<br /><br />Our goal is to estimate the value of the beta parameters (the <b>regression coefficients</b>).<br /><br /><span class="fullpost">We begin by examining a <b>model of simple logistic regression</b> (with <i>only one predictor</i>).<br /><br />Consider the following example. The table below shows the results of a study on gastroesophageal reflux. 
We want to evaluate how the presence of a stress factor can influence the onset of this disease.<br /><br /><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 244px; height: 141px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjliXaOM1AdZGxUHF6noIzk5_GoKbdgnM9sYXekRIcOvQT_iOXSk3wdLMLEA4EfEhqMg96EFmMgLwd3T8OLxEzJcJHDSEVYv6XwYwMffl4C6wt5CO_7fEq7b7za9zdSoZ2Ichjg_1FnEv4/s400/refl.JPG" border="0" alt="" id="BLOGGER_PHOTO_ID_5371997672767937474" /><br /><br />First we import the values into R. We must create a two-way table; proceed as follows:<br /><br /><pre name="code" class="java"><br />reflux <- matrix(c(251,131,4,33), nrow=2)<br />colnames(reflux) <- c("reflNO", "reflYES")<br />rownames(reflux) <- c("stressNO", "stressYES")<br />table <- as.table(reflux)<br /><br />table<br /><br /> reflNO reflYES<br />stressNO 251 4<br />stressYES 131 33</pre><br /><br />Now we prepare the data for the logistic regression. We must create a data frame:<br /><br /><pre name="code" class="java"><br />dft <- as.data.frame(table)<br />dft<br /><br /> Var1 Var2 Freq<br />1 stressNO reflNO 251<br />2 stressYES reflNO 131<br />3 stressNO reflYES 4<br />4 stressYES reflYES 33</pre><br /><br /><br />We can now fit the model, and then perform the logistic regression in R:<br /><br /><pre name="code" class="java"><br />fit <- glm(Var2 ~ Var1, weights = Freq, data = dft, family = binomial(logit))<br />summary(fit)<br /><br /><br />Call:<br />glm(formula = Var2 ~ Var1, family = binomial(logit), data = dft, <br /> weights = Freq)<br /><br />Deviance Residuals: <br /> 1 2 3 4 <br />-2.817 -7.672 5.765 10.287 <br /><br />Coefficients:<br /> Estimate Std. Error z value Pr(>|z|) <br />(Intercept) -4.1392 0.5040 -8.213 < 2e-16 ***<br />Var1stressYES 2.7605 0.5403 5.109 3.23e-07 ***<br />---<br />Signif. 
codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 <br /><br />(Dispersion parameter for binomial family taken to be 1)<br /><br /> Null deviance: 250.23 on 3 degrees of freedom<br />Residual deviance: 205.86 on 2 degrees of freedom<br />AIC: 209.86<br /><br />Number of Fisher Scoring iterations: 6</pre><br /><br /><br />First let us comment on the code that performs the regression. Logistic regression is requested by setting the family: <code style="color: rgb(153, 0, 0);">family = binomial(logit)</code>. The code <code style="color: rgb(153, 0, 0);">Var2 ~ Var1</code> means that we want a model that explains the variable Var2 (presence or absence of reflux) as a function of the variable Var1 (presence or absence of stressful events). In practice Var2 is the dependent variable Y, and Var1 is the independent variable X (the predictor). After the formula, we specify the weight of each row, given in the column <code style="color: rgb(153, 0, 0);">Freq</code> of the dataframe <code style="color: rgb(153, 0, 0);">dft</code> (so we write <code style="color: rgb(153, 0, 0);">weights = Freq</code> and <code style="color: rgb(153, 0, 0);">data = dft</code> to specify where the values are found).<br /><br />The values of the parameters $\beta_0$ and $\beta_1$ are respectively the estimates labelled <code style="color: rgb(153, 0, 0);">(Intercept)</code> and <code style="color: rgb(153, 0, 0);">Var1stressYES</code>. We can then write our empirical model:<br /><br />$$\pi(x)=\frac{exp(-4.139+2.760x)}{1+exp(-4.139+2.760x)}$$<br /><br />The independent variable x can be zero or one. 
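<br /><br />Since x takes only two values, the model yields just two fitted probabilities, and R can return them directly from the fitted model. This is only a minimal sketch, assuming the same table as above, rebuilt here so that the code runs on its own; the name <code style="color: rgb(153, 0, 0);">newx</code> is hypothetical, introduced here for illustration:<br /><br /><pre name="code" class="java">
## same table and model as above
reflux <- matrix(c(251,131,4,33), nrow=2)
colnames(reflux) <- c("reflNO", "reflYES")
rownames(reflux) <- c("stressNO", "stressYES")
dft <- as.data.frame(as.table(reflux))
fit <- glm(Var2 ~ Var1, weights = Freq, data = dft, family = binomial(logit))

## the two possible values of the predictor (x = 0 and x = 1)
newx <- data.frame(Var1 = factor(c("stressNO", "stressYES"), levels = levels(dft$Var1)))

## type="response" returns probabilities instead of values on the logit scale
probs <- predict(fit, newdata = newx, type = "response")
probs   # about 0.0157 and 0.2012, i.e. 4/255 and 33/164
</pre><br /><br />With a single dichotomous predictor the fitted probabilities coincide with the observed proportions of reflux in the two rows of the table.<br /><br />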
If x takes the value 0 (i.e. in the absence of stressful events), then the probability of having reflux is:<br /><br />$$\pi(x=0)=\frac{exp(\beta_0)}{1+exp(\beta_0)}=0.016=1.6\%$$<br /><br />If there are stressful events (x = 1), the probability of having reflux is:<br /><br />$$\pi(x=1)=\frac{exp(\beta_0+\beta_1)}{1+exp(\beta_0+\beta_1)}=0.20=20\%$$<br /><br />The odds are:<br /><br />$$odds(x=1)=\frac{\pi(1)}{1-\pi(1)}=exp(\beta_0+\beta_1)$$<br /><br />$$odds(x=0)=\frac{\pi(0)}{1-\pi(0)}=exp(\beta_0)$$<br /><br />We can finally calculate the odds ratio (OR):<br /><br />$$OR=\frac{odds(x=1)}{odds(x=0)}=15.807$$<br /><br />A person who has experienced a stressful event has odds of developing gastroesophageal reflux 15.807 times larger than a person who has not undergone stressful events.<br /><br />The probabilities and the odds can be readily calculated in R recalling that:<br /><br /><code style="color: rgb(153, 0, 0);">fit$coefficients[1]</code> = $\beta_0$ (intercept)<br /><code style="color: rgb(153, 0, 0);">fit$coefficients[2]</code> = $\beta_1$<br /><br />Furthermore:<br /><br /><code style="color: rgb(153, 0, 0);">summary(fit)$coefficients[1,2]</code> = standard error of $\beta_0$<br /><code style="color: rgb(153, 0, 0);">summary(fit)$coefficients[2,2]</code> = standard error of $\beta_1$<br /><br />And so we have:<br /><br /><pre name="code" class="java"><br />pi0 <- exp(fit$coefficients[1]) / (1 + exp(fit$coefficients[1]))<br />pi1 <- exp(fit$coefficients[1] + fit$coefficients[2]) / (1 + exp(fit$coefficients[1]+fit$coefficients[2]))<br /><br />odds0 <- pi0 / (1 - pi0)<br />odds1 <- pi1 / (1 - pi1)<br /><br />OR <- odds1 / odds0<br /><br />#the same result with:<br />OR <- exp(fit$coefficients[2])<br /><br />#the 95% confidence interval for OR is:<br />ORmin <- exp( fit$coefficients[2] - qnorm(.975) * summary(fit)$coefficients[2,2] )<br /><br />ORmax <- exp( fit$coefficients[2] + qnorm(.975) * summary(fit)$coefficients[2,2] )</pre><br /><br />We can obtain the same result 
for the odds ratio, using the simplified formula:<br /><br />$$OR=\frac{ad}{bc}=\frac{251\cdot33}{4\cdot131}=15.807$$<br /><br />that in R is:<br /><br /><pre name="code" class="java"><br />OR <- (table[1,1]*table[2,2]) / (table[1,2]*table[2,1])</pre><br /><br />The acronym AIC stands for <b>Akaike's information criterion</b>. On its own this value says little about the model just created. It is useful for comparing this model with other candidate models (the model with the lowest AIC is preferred).</span>Todos Logoshttp://www.blogger.com/profile/09881188152777475558noreply@blogger.com0tag:blogger.com,1999:blog-4274823366855967619.post-8051173505248997682009-08-08T09:59:00.004+02:002010-10-17T18:47:58.958+02:00Trend Analysis with the Cox-Stuart test in RThe <b>Cox-Stuart test</b> is described as a test of low power (power equal to 0.78), but very robust for <b>trend analysis</b>. It is therefore applicable to a wide variety of situations, to get an idea of how the observed values evolve. The proposed method is based on the <em>binomial distribution</em>. Base R has no function to perform the Cox-Stuart test, so we first go through the logical steps behind the test and then write the function ourselves.<br /><br /><span class="fullpost">You want to assess whether there is an <em>increasing or decreasing trend</em> in the number of daily customers of a restaurant. We have the number of customers over 15 days:<br /><div style="text-align: center;">Customers: 5, 9, 12, 18, 17, 16, 19, 20, 4, 3, 18, 16, 17, 15, 14</div><br /><br />To perform the Cox-Stuart test, the number of observations must be even. In our case we have 15 observations. 
Delete, therefore, the observation at position (N+1)/2 (here the observation with value = 20):<br /><br /><pre name="code" class="java"><br />customers = c(5, 9, 12, 18, 17, 16, 19, 20, 4, 3, 18, 16, 17, 15, 14)<br /><br />length(customers)<br />[1] 15<br /><br />cust_even = customers[ -(length(customers)+1)/2 ]<br />length(cust_even)<br />[1] 14</pre><br /><br />Now we have 14 observations, and we can then proceed. Divide the observations into two vectors, the first containing the first half of the measures, and the second the second half:<br /><br /><pre name="code" class="java"><br />fHalf = cust_even[1:7]<br />sHalf = cust_even[8:14]<br /><br />fHalf<br />[1] 5 9 12 18 17 16 19<br /><br />sHalf<br />[1] 4 3 18 16 17 15 14</pre><br /><br />Now subtract, value by value, the content of the two vectors:<br /><br /><pre name="code" class="java"><br />difference = fHalf - sHalf<br /><br />difference<br />[1] 1 6 -6 2 0 1 5</pre><br /><br />Now consider only the signs of the contents of the vector difference<br /><br /><pre name="code" class="java"><br />signs = sign(difference)<br /><br />signs<br />[1] 1 1 -1 1 0 1 1</pre><br /><br />A difference has value 0 and therefore also in the vector with the signs there is a value equal to 0. This must be eliminated:<br /><br /><pre name="code" class="java"><br />signs = signs[ signs != 0 ]<br /><br />signs<br />[1] 1 1 -1 1 1 1</pre><br /><br />We obtained six differences, and then six signs. Now we have to count the number of positive-signs and the number of negative-signs:<br /><br /><pre name="code" class="java"><br />pos = signs[signs > 0]<br />neg = signs[signs < 0]<br /><br />length(pos)<br />[1] 5<br /><br />length(neg)<br />[1] 1</pre><br /><br />Now we choose the number of signs that is smaller. In this case we choose the number of negative signs (1). 
We compute the probability of obtaining at most x = 1 successes in N = 6 trials, each of which yields success with probability p = 0.5 (binomial distribution):<br /><br /><pre name="code" class="java"><br />pbinom(1, 6, 0.5)<br />[1] 0.109375</pre><br /><br />The value so calculated is higher than 0.05 (we choose a significance level of 5%). Therefore there is no significant trend (it would have been a decreasing trend, since the negative signs are the minority).<br />If the value had been less than 0.05, we would have accepted the hypothesis of a significant trend.<br /><br />Now try to fit a regression model, and observe the p-value of the slope: the coefficient b is not significant.<br /><br /><pre name="code" class="java"><br />customers = c(5, 9, 12, 18, 17, 16, 19, 20, 4, 3, 18, 16, 17, 15, 14)<br />days <- c(1:length(customers))<br />model <- lm(customers ~ days)<br />summary(model)<br /><br />Call:<br />lm(formula = customers ~ days)<br /><br />Residuals:<br /> Min 1Q Median 3Q Max <br />-11.090 -2.173 1.352 3.967 6.467 <br /><br />Coefficients:<br /> Estimate Std. Error t value Pr(>|t|) <br />(Intercept) 11.3048 3.1104 3.634 0.00303 **<br />days 0.2786 0.3421 0.814 0.43014 <br />---<br />Signif. 
codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 <br /><br />Residual standard error: 5.724 on 13 degrees of freedom</pre><br /><br /><br />Here is the code to perform a Cox-Stuart test, written by me.<br /><br /><pre name="code" class="java"><br />cox.stuart.test =<br />function (x)<br />{<br /> method = "Cox-Stuart test for trend analysis"<br /> leng = length(x)<br /> apross = round(leng) %% 2<br /> if (apross == 1) {<br /> delete = (length(x)+1)/2<br /> x = x[ -delete ] <br /> }<br /> half = length(x)/2<br /> x1 = x[1:half]<br /> x2 = x[(half+1):(length(x))]<br /> difference = x1-x2<br /> signs = sign(difference)<br /> signcorr = signs[signs != 0]<br /> pos = signs[signs>0]<br /> neg = signs[signs<0]<br /> if (length(pos) < length(neg)) {<br /> prop = pbinom(length(pos), length(signcorr), 0.5)<br /> names(prop) = "Increasing trend, p-value"<br /> rval <- list(method = method, statistic = prop)<br /> class(rval) = "htest"<br /> return(rval)<br /> }<br /> else {<br /> prop = pbinom(length(neg), length(signcorr), 0.5)<br /> names(prop) = "Decreasing trend, p-value"<br /> rval <- list(method = method, statistic = prop)<br /> class(rval) = "htest"<br /> return(rval)<br /> }<br />}</pre><br /><br />We can now use the function just created:<br /><br /><pre name="code" class="java"><br />customers = c(5, 9, 12, 18, 17, 16, 19, 20, 4, 3, 18, 16, 17, 15, 14)<br />cox.stuart.test(customers)<br /><br /> Cox-Stuart test for trend analysis<br /><br />data: <br />Decreasing trend, p-value = 0.1094</pre><br /></span>Todos Logoshttp://www.blogger.com/profile/09881188152777475558noreply@blogger.com9tag:blogger.com,1999:blog-4274823366855967619.post-29828205066751299472009-08-07T20:28:00.002+02:002009-08-07T20:34:53.813+02:00Two-way analysis of variance: two-way ANOVA in RThe <b><a href="http://statistic-on-air.blogspot.com/2009/07/analysis-of-variance-anova-for-multiple.html">one-way analysis of variance</a></b> is a useful technique to verify whether the means of several groups are 
equal. But this analysis may not be very useful for more complex problems. For example, it may be necessary to take into account two factors of variability to determine whether the averages between the groups depend on the group classification ("zone") or on a second variable ("block"). In this case the <b>two-way analysis of variance</b> (<b>two-way ANOVA</b>) should be used.<br /><br /><span class="fullpost">We begin immediately with an example so as to facilitate the understanding of this statistical method. The data collected are organized into <b> double entry tables</b>.<br /><br />The director of a company has collected the monthly revenue (in thousands of dollars) for 5 years. You want to see if the revenue depends on the year and/or the month, or if it is independent of these two factors.<br /><br />Conceptually, the problem may be solved by a <em>horizontal ANOVA</em> and a <em>vertical ANOVA</em>, in order to verify whether the average revenues per year are the same, and whether the average revenues per month are the same. This would require many calculations, and so we prefer to use the two-way ANOVA, which provides the result instantly.<br />This is the table of revenue classified by year and month:<br /><br />$$\begin{tabular}{|c||ccccc|}\hline Months & Year 1 & Year 2 & Year 3 & Year 4 & Year 5\\\hline January&15&18&22&23&24\\ February&22&25&15&15&14\\ March&18&22&15&19&21\\ April&23&15&14&17&18\\ May&23&15&26&18&14\\ June&12&15&11&10&8\\ July&26&12&23&15&18\\ August&19&17&15&20&10\\ September&15&14&18&19&20\\ October&14&18&10&12&23\\ November&14&22&19&17&11\\ December&21&23&11&18&14\\ \hline \end{tabular}$$<br /><br />As with the one-way ANOVA, here too the aim is to set up a <b>Fisher's F-test</b> to assess the significance of the variable "month" and of the variable "year", and so determine whether the revenues depend on one (or both) of the classification criteria.<br /><em>How to perform the two-way ANOVA in R</em>? 
First create a vector containing all the tabulated values, transcribed by rows:<br /><br /><pre name="code" class="java"><br />revenue = c(15,18,22,23,24, 22,25,15,15,14, 18,22,15,19,21, <br /> 23,15,14,17,18, 23,15,26,18,14, 12,15,11,10,8, 26,12,23,15,18, <br /> 19,17,15,20,10, 15,14,18,19,20, 14,18,10,12,23, 14,22,19,17,11, <br /> 21,23,11,18,14)</pre><br /><br />For the months, you create a factor with 12 levels (the number of rows) and 5 repetitions (the number of columns) in this manner:<br /><br /><pre name="code" class="java"><br />months = gl(12,5)</pre><br /><br />For the years, you create a factor with 5 levels (the number of columns) and 1 repetition, declaring the total number of observations (the length of the vector revenue):<br /><br /><pre name="code" class="java"><br />years = gl(5, 1, length(revenue))</pre><br /><br />Now you can fit the linear model and produce the ANOVA table: <br /><br /><pre name="code" class="java"><br />fit = aov(revenue ~ months + years)<br /><br />anova(fit)<br /><br />Analysis of Variance Table<br /><br />Response: revenue<br /> Df Sum Sq Mean Sq F value Pr(>F)<br />months 11 308.45 28.04 1.4998 0.1660<br />years 4 44.17 11.04 0.5906 0.6712<br />Residuals 44 822.63 18.70 </pre><br /><br />Now interpret the results.<br />The test statistic for the difference between months is: <i>F = 1.4998</i>. This value is lower than the tabulated value and indeed <i>p-value > 0.05</i>. So we do not reject the null hypothesis: the mean revenues by month are not significantly different, so the variable "months" has no significant effect on revenue.<br /><br />The test statistic for the difference between years is: <i>F = 0.5906</i>. This value is lower than the tabulated value and indeed <i>p-value > 0.05</i>. 
So we do not reject the null hypothesis: the mean revenues by year are not significantly different, so the variable "years" has no significant effect on revenue.<br /></span>Todos Logoshttp://www.blogger.com/profile/09881188152777475558noreply@blogger.com2tag:blogger.com,1999:blog-4274823366855967619.post-46103757551455481452009-08-06T07:30:00.003+02:002009-08-06T10:53:25.695+02:00Simple linear regressionWe use <b>regression analysis</b> when, from the data sample, we want to derive a statistical model that predicts the values of a variable (Y, <em>dependent</em>) from the values of another variable (X, <em>independent</em>). The <b>linear regression</b>, which is the simplest and most frequent relationship between two quantitative variables, can be <em>positive</em> (when X increases, Y increases too) or <em>negative</em> (when X increases, Y decreases): this is indicated by the sign of the coefficient <code style="color: rgb(153, 0, 0);">b</code>.<br /><br /><span class="fullpost">To build the line that describes the distribution of points, we might refer to different principles. The most common is the <b>least squares method</b> (or <em>Model I</em>), and this is the method used by the statistical software R.<br /><br />Suppose you want to obtain a linear relationship between weight (kg) and height (cm) of 10 subjects.<br /><div style="text-align: center;">Height: 175, 168, 170, 171, 169, 165, 165, 160, 180, 186<br />Weight: 80, 68, 72, 75, 70, 65, 62, 60, 85, 90</div><br /><br />The first problem is to decide which is the dependent variable Y and which is the independent variable X. In general, the independent variable is not affected by an error during the measurement (or is affected only by random error), while the dependent variable is affected by error. 
In our case we can assume that the variable weight is the independent variable (X), and height the dependent variable (Y).<br />So our problem is to find a linear relationship (formula) that allows us to calculate the height, given the weight of an individual. The simplest formula is that of a straight line of type <code style="color: rgb(153, 0, 0);">Y = a + bX</code>. The simple regression line in R is calculated as follows:<br /><br /><pre name="code" class="java"><br />height = c(175, 168, 170, 171, 169, 165, 165, 160, 180, 186)<br />weight = c(80, 68, 72, 75, 70, 65, 62, 60, 85, 90)<br /> <br />model = lm(formula = height ~ weight, x=TRUE, y=TRUE)<br />model<br /><br />Call:<br />lm(formula = height ~ weight, x = TRUE, y = TRUE)<br /><br />Coefficients:<br />(Intercept) weight <br /> 115.2002 0.7662 </pre><br /><br />The correct syntax of the formula stated in lm is: Y ~ X, so you declare first the dependent variable, and then the independent variable (or variables).<br />The output of the function is represented by two parameters <b>a</b> and <b>b</b>: <code style="color: rgb(153, 0, 0);">a=115.2002</code> (intercept), <code style="color: rgb(153, 0, 0);">b=0.7662</code> (the slope).<br /><br /><center><hr width="50%"></center><br /><br />The simple calculation of the line is not enough. We must assess the significance of the line, i.e. whether the slope <code style="color: rgb(153, 0, 0);">b</code> differs from zero significantly. This may be done with a <b>Student's t-test</b> or with a <b>Fisher's F-test</b>.<br />In R both can be retrieved very quickly, with the function summary(). Here's how:<br /><br /><pre name="code" class="java"><br />model <- lm(height ~ weight)<br />summary(model)<br /><br />Call:<br />lm(formula = height ~ weight)<br /><br />Residuals:<br /> Min 1Q Median 3Q Max <br />-1.6622 -0.9683 -0.1622 0.5679 2.2979 <br /><br />Coefficients:<br /> Estimate Std. 
Error t value Pr(>|t|) <br />(Intercept) 115.20021 3.48450 33.06 7.64e-10 ***<br />weight 0.76616 0.04754 16.12 2.21e-07 ***<br />---<br />Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 <br /><br />Residual standard error: 1.405 on 8 degrees of freedom<br />Multiple R-squared: 0.9701, Adjusted R-squared: 0.9664 <br />F-statistic: 259.7 on 1 and 8 DF, p-value: 2.206e-07 </pre><br /><br /><br />Here too there are the values of the parameters <b>a</b> and <b>b</b>.<br />The <b>Student's t-test</b> on the slope in this case has the value 16.12; the <b>Student's t-test</b> on the intercept has the value 33.06; the value of the <b>Fisher's F test</b> is 259.7 (the same value would be obtained by performing an <a href="http://statistic-on-air.blogspot.com/2009/07/analysis-of-variance-anova-for-multiple.html">ANOVA</a> on the same data: <code style="color: rgb(153, 0, 0);">anova(model)</code>). The p-values of the t-tests and the F-test are less than 0.05, so the model we found is significant.<br />The <b>Multiple R-squared</b> is the <b>coefficient of determination</b>. It provides a measure of how well future outcomes are likely to be predicted by the model. 
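<br /><br />The Multiple R-squared reported by summary() can be reproduced by hand as one minus the ratio between the residual and the total sum of squares. The following is only a minimal sketch, assuming the same height and weight vectors used above; the names ss_res, ss_tot and r2_manual are introduced here for illustration and are not part of the original analysis.<br /><br />

```r
# Same data as in the example above
height <- c(175, 168, 170, 171, 169, 165, 165, 160, 180, 186)
weight <- c(80, 68, 72, 75, 70, 65, 62, 60, 85, 90)
model <- lm(height ~ weight)

# Illustrative names (assumptions, not from the original post)
ss_res <- sum(residuals(model)^2)          # residual sum of squares
ss_tot <- sum((height - mean(height))^2)   # total sum of squares around the mean
r2_manual <- 1 - ss_res / ss_tot           # coefficient of determination

r2_manual
summary(model)$r.squared                   # value computed by R, for comparison
```

<br /><br />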
In this case, 97.01% of the variability in height is explained by our model.<br /><br />We can plot on a graph the data points and the regression line, in this way:<br /><br /><pre name="code" class="java"><br />plot(weight, height)<br />abline(model)</pre><br /></span>Todos Logoshttp://www.blogger.com/profile/09881188152777475558noreply@blogger.com2tag:blogger.com,1999:blog-4274823366855967619.post-36099716375645061942009-08-05T07:30:00.001+02:002009-08-05T10:38:39.596+02:00Contingency table and the study of the correlation between qualitative variables: Pearson's Chi-squared testIf you have <em>qualitative variables</em>, it is possible to verify their correlation (association) by studying a <b>contingency table R by C</b>, using the <b>Pearson's Chi-squared test</b>.<br /><br />A casino wants to study the correlation between the modes of play and the number of winners by age group, to see if the number of winners depends on the type of game chosen, in light of experience. 
It has the following data (number of winners per 100 players, by game and age group):<br /><br /><center>$$\begin{tabular}{c|ccc}&Age\\\hline Game&20-30&31-40&41-50\\ \hline Roulette&44&56&55\\ Black-jack& 66& 88& 23\\Poker& 15& 29& 45 \end{tabular}$$</center><br /><br /><span class="fullpost">In R, we must first build a matrix with the data collected:<br /><br /><pre name="code" class="java"><br />table <- matrix(c(44,56,55, 66,88,23, 15,29,45), nrow=3, byrow=TRUE)</pre><br /><br />Now we can run the chi-squared test:<br /><br /><pre name="code" class="java"><br />chisq.test(table)<br /><br /> Pearson's Chi-squared test<br /><br />data: table <br />X-squared = 46.0767, df = 4, p-value = 2.374e-09</pre><br /><br />We reject the null hypothesis H0 of independence in favor of the alternative hypothesis (<i>p-value < 0.05</i>): there is a significant association between the type of game and the age group of the winners.</span>Todos Logoshttp://www.blogger.com/profile/09881188152777475558noreply@blogger.com0tag:blogger.com,1999:blog-4274823366855967619.post-90255582534645200732009-08-04T07:30:00.001+02:002009-08-05T10:37:15.533+02:00Non-parametric methods for the study of the correlation: Spearman's rank correlation coefficient and Kendall tau rank correlation coefficientWe saw in the previous post how to study the correlation between variables that follow a Gaussian distribution with the <a href="http://statistic-on-air.blogspot.com/2009/08/parametric-method-for-study-of.html"> Pearson product-moment correlation coefficient</a>. If it is not possible to assume that the values follow Gaussian distributions, we have two non-parametric methods: the <b>Spearman's rho test</b> and <b>Kendall's tau test</b>.<br /><br /><span class="fullpost">For example, you want to study the productivity of various types of machinery and the satisfaction of operators in their use (rated with a number from 1 to 10). 
These are the values:<br /><div style="text-align: center;">Productivity: 5, 7, 9, 9, 8, 6, 4, 8, 7, 7<br />Satisfaction: 6, 7, 4, 4, 8, 7, 3, 9, 5, 8</div><br /><br />We begin with the <b><u>Spearman's rank correlation coefficient</u></b>:<br /><br /><pre name="code" class="java"><br />a <- c(5, 7, 9, 9, 8, 6, 4, 8, 7, 7)<br />b <- c(6, 7, 4, 4, 8, 7, 3, 9, 5, 8)<br /><br />cor.test(a, b, method="spearman")<br /><br /> Spearman's rank correlation rho<br /><br />data: a and b <br />S = 145.9805, p-value = 0.7512<br />alternative hypothesis: true rho is not equal to 0 <br />sample estimates:<br /> rho <br />0.1152698 </pre><br /><br />The statistical test gives us as a result <i>rho = 0.115</i>, which indicates a low (non-parametric) correlation between the two sets of values.<br />Since <i>p-value > 0.05</i>, we cannot reject the null hypothesis that rho is zero: the computed correlation is not statistically significant.<br /><br />Now we check the same data with the <b><u>Kendall tau rank correlation coefficient</u></b>:<br /><br /><pre name="code" class="java"><br />a <- c(5, 7, 9, 9, 8, 6, 4, 8, 7, 7)<br />b <- c(6, 7, 4, 4, 8, 7, 3, 9, 5, 8)<br /> <br />cor.test(a, b, method="kendall")<br /><br /> Kendall's rank correlation tau<br /><br />data: a and b <br />z = 0.5555, p-value = 0.5786<br />alternative hypothesis: true tau is not equal to 0 <br />sample estimates:<br /> tau <br />0.146385</pre><br /><br />Even with the Kendall test, the correlation is very low (<i>tau = 0.146</i>), and not significant (<i>p-value > 0.05</i>).</span>Todos Logoshttp://www.blogger.com/profile/09881188152777475558noreply@blogger.com2tag:blogger.com,1999:blog-4274823366855967619.post-74615198383143443882009-08-03T09:33:00.003+02:002009-08-05T10:35:18.234+02:00Parametric method for the study of the correlation: the Pearson r-testSuppose you want to study whether there is a correlation between 2 sets of data. 
To do this we compute the <b>Pearson product-moment correlation coefficient</b>, which is a measure of the correlation (linear dependence) between two variables X and Y; then we compute the value of a t-test to study the significance of the <b>Pearson coefficient R</b>. We can use this test when the data follow a Gaussian distribution.<br /><br /><span class="fullpost">A new test to measure IQ is administered to 10 volunteers. You want to see if there is a correlation between the new experimental test and the classical test, in order to replace the old test with the new test. These are the values:<br /><div style="text-align: center;">Old test: 15, 21, 25, 26, 30, 30, 22, 29, 19, 16<br />New test: 55, 56, 89, 67, 84, 89, 99, 62, 83, 88</div><br /><br />R has a single function which directly gives us the value of the Pearson coefficient and the t-test for checking the significance of the coefficient:<br /><br /><pre name="code" class="java"><br />a = c(15, 21, 25, 26, 30, 30, 22, 29, 19, 16)<br />b = c(55, 56, 89, 67, 84, 89, 99, 62, 83, 88)<br /><br />cor.test(a, b)<br /><br /> Pearson's product-moment correlation<br /><br />data: a and b <br />t = 0.4772, df = 8, p-value = 0.646<br />alternative hypothesis: true correlation is not equal to 0 <br />95 percent confidence interval:<br /> -0.5174766 0.7205107 <br />sample estimates:<br /> cor <br />0.166349</pre><br /><br />The value of the Pearson coefficient is 0.166: it is a very low value, which indicates a poor correlation between the variables. 
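<br /><br />The t statistic printed by cor.test() is derived from the Pearson coefficient r and the sample size n as t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom. The following is only a minimal sketch of that calculation on the same two vectors; the names r, n, t_stat and p_value are chosen here for illustration.<br /><br />

```r
# Same data as in the example above
a <- c(15, 21, 25, 26, 30, 30, 22, 29, 19, 16)
b <- c(55, 56, 89, 67, 84, 89, 99, 62, 83, 88)

# Illustrative names (assumptions, not from the original post)
r <- cor(a, b)                                # Pearson coefficient
n <- length(a)                                # number of paired observations
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)     # t statistic with n - 2 degrees of freedom
p_value <- 2 * pt(-abs(t_stat), df = n - 2)   # two-sided p-value

t_stat
p_value
```

<br /><br />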
<br />Furthermore, the p-value is greater than 0.05, so we cannot reject the null hypothesis: the Pearson coefficient is not significantly different from zero.<br />So we can say that there is no evidence of a correlation between the results of the two tests.</span>Todos Logoshttp://www.blogger.com/profile/09881188152777475558noreply@blogger.com3tag:blogger.com,1999:blog-4274823366855967619.post-14291171929572685562009-07-31T10:46:00.001+02:002009-08-05T10:29:46.028+02:00Kruskal-Wallis one-way analysis of varianceIf you have to compare multiple groups, but you cannot run an <a href="http://statisticaconr.blogspot.com/2009/07/confronto-tra-piu-gruppi-metodo.html">ANOVA for multiple comparisons</a> because the groups do not follow a normal distribution, you can use the <b>Kruskal-Wallis test</b>, which does not require the assumption that the groups follow a Gaussian distribution.<br />This test extends the Wilcoxon rank-sum test for 2 samples to more than two groups.<br /><br />Suppose you want to see if the following 4 sets of values are statistically similar:<br /><div style="text-align: center;">Group A: 1, 5, 8, 17, 16<br />Group B: 2, 16, 5, 7, 4<br />Group C: 1, 1, 3, 7, 9<br />Group D: 2, 15, 2, 9, 7</div><br /><br /><span class="fullpost">To use the Kruskal-Wallis test, simply enter the data and then organize them into a list:<br /><br /><pre name="code" class="java"><br />a = c(1, 5, 8, 17, 16)<br />b = c(2, 16, 5, 7, 4)<br />c = c(1, 1, 3, 7, 9)<br />d = c(2, 15, 2, 9, 7)<br /><br />dati = list(g1=a, g2=b, g3=c, g4=d)</pre><br /><br />Now we can apply the <code style="color: rgb(153, 0, 0);">kruskal.test()</code> function:<br /><br /><pre name="code" class="java"><br />kruskal.test(dati)<br /><br /> Kruskal-Wallis rank sum test<br /><br />data: dati <br />Kruskal-Wallis chi-squared = 1.9217, df = 3, p-value = 0.5888</pre><br /><br />The value of the test statistic is 1.9217. This value already contains the correction for ties (repetitions). 
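<br /><br />The same test can also be run on data stored in "long" format, with one vector of values and one grouping factor, through the formula interface of kruskal.test(). This is only a sketch under that assumption; the names values, group and res are introduced here for illustration.<br /><br />

```r
# The four groups from the example, stacked into one vector (A, B, C, D)
values <- c(1, 5, 8, 17, 16,
            2, 16, 5, 7, 4,
            1, 1, 3, 7, 9,
            2, 15, 2, 9, 7)

# Grouping factor: five observations per group (illustrative name)
group <- factor(rep(c("A", "B", "C", "D"), each = 5))

# Formula interface: response ~ grouping factor
res <- kruskal.test(values ~ group)

res$statistic   # Kruskal-Wallis chi-squared
res$p.value     # p-value of the test
```

<br /><br />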
The p-value is greater than 0.05; also the value of the test statistic is lower than the tabulated chi-square value:<br /><br /><pre name="code" class="java"><br />qchisq(0.950, 3)<br />[1] 7.814728</pre><br /><br />The conclusion is therefore that we do not reject the null hypothesis H0: the 4 groups are not significantly different.</span>Todos Logoshttp://www.blogger.com/profile/09881188152777475558noreply@blogger.com2tag:blogger.com,1999:blog-4274823366855967619.post-33471533468975374132009-07-30T09:32:00.002+02:002009-08-05T10:27:53.533+02:00Analysis of variance: ANOVA, for multiple comparisons<span style="font-weight: bold;">Analysis of variance: ANOVA, for multiple comparisons</span><br /><br />The ANOVA model can be used to compare the means of several groups with each other, using a parametric method (assuming that the groups follow a Gaussian distribution).<br />Proceed with the following example:<br /><br />The manager of a supermarket chain wants to see whether the consumption in kilowatts of 4 stores is the same. He collects data at the end of each month for 6 months. The results are:<br /><div style="text-align: center;">Store A: 65, 48, 66, 75, 70, 55<br />Store B: 64, 44, 70, 70, 68, 59<br />Store C: 60, 50, 65, 69, 69, 57<br />Store D: 62, 46, 68, 72, 67, 56</div><br /><br /><span class="fullpost">Before proceeding with the ANOVA, we must first verify the homoskedasticity (i.e. test for homogeneity of variances). 
R provides two tests: the Bartlett test, and the Fligner-Killeen test.<br /><br /><center><hr width="50%"></center><br /><br />We begin with the <b>Bartlett test</b>.<br /><br />First we create the 4 vectors:<br /><br /><pre name="code" class="java"><br />a = c(65, 48, 66, 75, 70, 55)<br />b = c(64, 44, 70, 70, 68, 59)<br />c = c(60, 50, 65, 69, 69, 57)<br />d = c(62, 46, 68, 72, 67, 56)</pre><br /><br />Now we combine the 4 vectors in a single vector:<br /><br /><pre name="code" class="java"><br />dati = c(a, b, c, d)</pre><br /><br />Now, for this vector in which all the data are stored, we create a factor with the 4 levels:<br /><br /><pre name="code" class="java"><br />groups = factor(rep(letters[1:4], each = 6))</pre><br /><br />We can observe the contents of the vector <code style="color: rgb(153, 0, 0);">groups</code> simply by typing <code style="color: rgb(153, 0, 0);">groups + [enter]</code>.<br /><br />At this point we start the Bartlett test:<br /><br /><pre name="code" class="java"><br />bartlett.test(dati, groups)<br /><br /> Bartlett test of homogeneity of variances<br /><br />data: dati and groups <br />Bartlett's K-squared = 0.4822, df = 3, p-value = 0.9228</pre><br /><br />The function gave us the value of the test statistic (<b>K squared</b>), and the p-value. It can be argued that the variances are homogeneous since <i>p-value > 0.05</i>. 
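<br /><br />The test result can also be stored in an object, so that the statistic and the p-value are read programmatically instead of from the printed output. This is only a minimal sketch on the same data; the name bt is introduced here for illustration.<br /><br />

```r
# Same data as in the example above
a <- c(65, 48, 66, 75, 70, 55)
b <- c(64, 44, 70, 70, 68, 59)
c <- c(60, 50, 65, 69, 69, 57)
d <- c(62, 46, 68, 72, 67, 56)

dati <- c(a, b, c, d)
groups <- factor(rep(letters[1:4], each = 6))

# Store the result (illustrative name) and extract its components
bt <- bartlett.test(dati, groups)

bt$statistic        # Bartlett's K-squared
bt$p.value          # p-value of the test
bt$p.value > 0.05   # TRUE means no evidence against homogeneity of variances
```

<br /><br />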
Alternatively, we can compare the Bartlett's K-squared with the tabulated chi-square value; we compute that value by passing the probability 1 - alpha and the degrees of freedom to the <code style="color: rgb(153, 0, 0);">qchisq</code> function:<br /><br /><pre name="code" class="java"><br />qchisq(0.950, 3)<br />[1] 7.814728</pre><br /><br />Chi-squared > Bartlett's K-squared: we do not reject the null hypothesis H0 (homogeneity of variances)<br /><br /><center><hr width="50%"></center><br /><br />We now check the homoskedasticity with the <b>Fligner-Killeen test</b>.<br />The syntax is quite similar, so we proceed quickly.<br /><br /><pre name="code" class="java"><br />a = c(65, 48, 66, 75, 70, 55)<br />b = c(64, 44, 70, 70, 68, 59)<br />c = c(60, 50, 65, 69, 69, 57)<br />d = c(62, 46, 68, 72, 67, 56)<br /><br />dati = c(a, b, c, d)<br /><br />groups = factor(rep(letters[1:4], each = 6))<br /><br />fligner.test(dati, groups)<br /><br /> Fligner-Killeen test of homogeneity of variances<br /><br />data: dati and groups <br />Fligner-Killeen:med chi-squared = 0.1316, df = 3, p-value = 0.9878</pre><br /><br />The conclusions are similar to those for the Bartlett test.<br /><br /><center><hr width="50%"></center><br /><br />Having verified the homoskedasticity of the 4 groups, we can proceed with the <b>ANOVA model</b>.<br /><br />First organize the values, fitting the model:<br /><br /><pre name="code" class="java"><br />fit = lm(formula = dati ~ groups)</pre><br /><br />Then we analyze the ANOVA model:<br /><br /><pre name="code" class="java"><br />anova (fit)<br /><br />Analysis of Variance Table<br /><br />Response: dati<br /> Df Sum Sq Mean Sq F value Pr(>F)<br />groups 3 8.46 2.82 0.0327 0.9918<br />Residuals 20 1726.50 86.33 </pre><br /><br />The output of the function is a classical ANOVA table with the following data:<br /><code style="color: rgb(153, 0, 0);">Df</code> = degrees of freedom<br /><code style="color: rgb(153, 0, 0);">Sum Sq</code> = deviance (between 
groups, and residual)<br /><code style="color: rgb(153, 0, 0);">Mean Sq</code> = variance (between groups, and residual)<br /><code style="color: rgb(153, 0, 0);">F value</code> = the value of the Fisher test statistic, computed as (variance between groups) / (residual variance)<br /><code style="color: rgb(153, 0, 0);">Pr(>F)</code> = p-value<br /><br />Since <i>p-value > 0.05</i>, we do not reject the null hypothesis H0: the four means are not significantly different. We can also compare the computed F-value with the tabulated F-value, using 3 degrees of freedom for the numerator (groups) and 20 for the denominator (residuals):<br /><br /><pre name="code" class="java">qf(0.950, 3, 20)<br />[1] 3.098391</pre><br /><br />Tabulated F-value > computed F-value: we do not reject the null hypothesis.</span>Todos Logoshttp://www.blogger.com/profile/09881188152777475558noreply@blogger.com12