This week we worked with STATISTICS... actually, despite the use of the word statistics, we worked with GIS tools that use statistics to help discern whether the given data is distributed the way we expect. We used three different scary-sounding but easy-to-use tools (Geostatistical Analyst) to determine how our data relate to each other. For this week, we used an ESRI course. Prior to UWF, I had done a few of the more basic (i.e. FREE) ones, and they are dull. Somewhere between the monotone voice of the video and the beyond-dry writing, those less than 3 hours of training seemed like a good day's worth. ESRI makes a great software suite, but they need work in the keeping-things-lively department.
The primary focus of the lesson (I hope) is determining whether the data we have is correct in a statistical sense, i.e. whether the values relate to each other the way we would expect. What do I mean by correct? I mean "normally distributed". Normally distributed means that when you chart how often each value occurs, the data falls along a bell curve: most values sit near the average and fewer show up the farther out you go. To check for the bell curve, we used a histogram, which is a bar graph of how many values fall into each range. If the data is mostly normally distributed, the tallest bar is in the center and the remaining bars step down on either side in a pyramid-esque look. We then learnt to use a QQ plot, which plots the data's quantiles against those of a normal distribution and draws a reference line. If the points fall on or near that line, the data is "normally distributed". In our exercise, we determined that there were a few outliers (i.e. data that wasn't in line with the majority). My overall takeaway was that we used these tools and graphs to figure out if the data we were given is good. Outliers may be the result of bad data entry, so to speak, or of a special phenomenon.
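For anyone curious what the QQ plot is doing under the hood, here is a rough sketch in plain Python. It is not what the ESRI tool runs; the function names (`qq_points`, `straightness`), the plotting position `(i + 0.5) / n`, and the sample values are all my own assumptions for illustration. It pairs each sorted value with the value a normal distribution would predict at that rank, then uses a correlation score as a stand-in for eyeballing how close the points are to the line.

```python
from statistics import NormalDist, mean, stdev

def qq_points(values):
    """Pair each sorted value with the normal quantile expected at its rank."""
    ordered = sorted(values)
    n = len(ordered)
    # Normal distribution with the same mean and spread as the sample.
    fitted = NormalDist(mean(ordered), stdev(ordered))
    # (i + 0.5) / n is one common plotting position; other conventions exist.
    theoretical = [fitted.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))

def straightness(points):
    """Pearson correlation of the QQ points; closer to 1 means closer to a line."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in points)
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Made-up values, for illustration only (not the course data).
bell_like = [28, 29, 29, 30, 30, 30, 31, 31, 32]
with_outlier = bell_like + [72]

r_bell = straightness(qq_points(bell_like))
r_outlier = straightness(qq_points(with_outlier))
print(round(r_bell, 3), round(r_outlier, 3))
```

The bell-shaped sample scores close to 1, while the single stray value drags the second score down noticeably, which is the same thing you see on the plot when one point sits far off the line.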
In the exercises, we had to use weather stations in western Europe. Of course, temperature readings in high-altitude areas (the Alps) were much colder than in other areas. Using the QQ plot and histogram, we found an outlier in Switzerland. Most reported temperatures were somewhere in the range of winter conditions, whereas this single station was in the 70s. So there was an obvious problem. The whole exercise was essentially learning to use statistics to figure out if our data is good or not. Bad data can create issues down the line, or at least make a GIS technician's life harder.
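The same kind of check can be roughed out in a few lines of Python. This is only a sketch: the station names and readings are invented (I did not copy the course data), and the 1.5 x IQR rule is a common rule of thumb that I am swapping in here, not the method the ESRI tools use.

```python
from statistics import quantiles

# Invented readings in degrees F, for illustration only.
readings = {
    "station_a": 28,
    "station_b": 31,
    "station_c": 25,
    "station_d": 30,
    "station_e": 33,
    "station_f": 27,
    "station_g": 29,
    "station_h": 72,  # the suspicious one
    "station_i": 26,
    "station_j": 32,
}

def iqr_outliers(values_by_name, k=1.5):
    """Return the entries that fall outside k * IQR beyond the quartiles."""
    q1, _, q3 = quantiles(list(values_by_name.values()), n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return {name: v for name, v in values_by_name.items() if v < low or v > high}

flagged = iqr_outliers(readings)
print(flagged)
```

A flagged station is not automatically wrong. It only tells you where to look first, and whether it is a typo or a real phenomenon still takes a human to decide.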
The map below shows the distribution of temperature across western Europe. You'll notice an X and a cross; these mark the median center and the mean center of the stations. The directional distribution then shows the direction in which the stations are statistically spread. I used a color ramp to help display the temperature range better.
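To give a feel for how the two centers differ, here is a small Python sketch. The coordinates are made up, and the median center is computed with Weiszfeld's iteration, which I am using as a stand-in for "the point with the smallest total distance to all stations"; I am not claiming it matches the ArcGIS tool's exact procedure.

```python
import math

def mean_center(points):
    """Average of the x values and average of the y values."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def median_center(points, iterations=200, tol=1e-9):
    """Approximate the point minimizing total distance to all points (Weiszfeld)."""
    cx, cy = mean_center(points)  # start from the mean center
    for _ in range(iterations):
        wsum = xsum = ysum = 0.0
        for x, y in points:
            d = math.hypot(x - cx, y - cy)
            if d < tol:
                continue  # simplification: skip a point sitting on the estimate
            w = 1.0 / d
            wsum += w
            xsum += w * x
            ysum += w * y
        if wsum == 0.0:
            break
        nx, ny = xsum / wsum, ysum / wsum
        moved = math.hypot(nx - cx, ny - cy)
        cx, cy = nx, ny
        if moved < tol:
            break
    return (cx, cy)

# Made-up coordinates: four stations close together and one far away.
stations = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10)]
mean_pt = mean_center(stations)
median_pt = median_center(stations)
print(mean_pt, median_pt)
```

The far-away station pulls the mean center well outside the cluster, while the median center stays inside it. That is why the two symbols on a map can sit in noticeably different places when a few stations are far from the rest.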