Experimenting with data over a span of time is something I've done with ggplot2 for quite a while; one example came last year, when I wrote a series of functions to visualize the price of certain video games over the course of a decade. For this assignment, I opted for something simpler: a line graph showing unemployment from the late 1960s to the mid-2010s, using the unemploy column (the number of unemployed, in thousands) from the economics dataset that ships with ggplot2.

library(ggplot2)
ggplot(economics, aes(x = date, y = unemploy)) +
  geom_line() +
  geom_smooth() +
  labs(title = "Time Series Plot of Unemployment with Smooth Trend Line", x = "Date", y = "Unemployment") +
  theme_minimal()

The trend line makes the data easier for onlookers to read and makes the main message of the plot clear: unemployment in recent times has been steadily incre...
After some time, I'm thrilled to finally release the source code for my first-ever R package. Alongside the first three functions I described in the original description file, I've added an interactive_heatmap function that creates heatmaps of correlations among the numerical data in a given dataset. It's my hope that this package will make exploratory analysis easier for those dealing with unfamiliar datasets, leaving more time to figure out more advanced uses for the data. GitHub repository for DescribeR: https://github.com/Retrolovania/DescribeR
This week's assignment is mainly practice with more advanced ways of managing and working with data.
1. The unequal group sizes make the period and treatment tests order-dependent. In layman's terms, the test results change depending on the order in which period and treat appear in the model formula. Because of this imbalance, the ANOVA results cannot be taken entirely at face value.
2. The only singularity in these models occurs in the last one (z ~ b * (x+y)). In this model, x and y are proportional to each other within each level of b. R does not account for this when it sets up the model, so the affected coefficients come out as NA. The main issue is that R cannot anticipate a singularity that only appears when a factor (b in this case) interacts with the other variables in the formula.
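Both points can be reproduced with a small sketch. The data below are made up for illustration and are not the assignment's data: the first part builds a hypothetical unbalanced period/treat layout and compares the sequential ANOVA tables for the two term orders, and the second part builds hypothetical x and y that are linearly related within each level of b and fits z ~ b * (x + y).

```r
set.seed(1)

# Part 1: order dependence with unequal group sizes (hypothetical data).
# Period 1 is mostly treatment A, period 2 is mostly treatment B.
d <- data.frame(
  period = factor(c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2)),
  treat  = factor(c("A", "A", "A", "A", "B", "A", "B", "B", "B", "B", "B", "B"))
)
d$y <- rnorm(nrow(d)) + as.numeric(d$period) + 2 * (d$treat == "B")

# anova() on an lm fit gives sequential sums of squares, so each term
# is adjusted only for the terms listed before it.
a1 <- anova(lm(y ~ period + treat, data = d))
a2 <- anova(lm(y ~ treat + period, data = d))

ss_period_first <- a1["period", "Sum Sq"]  # period entered first
ss_period_last  <- a2["period", "Sum Sq"]  # period adjusted for treat
print(c(first = ss_period_first, last = ss_period_last))

# Part 2: a singular fit (hypothetical data).
# Within b = 1, y equals x; within b = 2, y equals 13 - x.
b <- gl(2, 4, 8)
x <- 1:8
y <- c(1:4, 8:5)
z <- rnorm(8)

fit <- lm(z ~ b * (x + y))
print(coef(fit))   # aliased coefficients are reported as NA
print(alias(fit))  # shows which terms depend on the others
```

The residual sum of squares is the same in both orderings; only the split between period and treat changes. In the second fit, the six model terms span only four dimensions, which is why lm() leaves two coefficients as NA.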