Final Blog (Blog V) MIS 587

When we started our work on this class 8 weeks ago, I was both apprehensive and curious to learn about big data and business intelligence. My prior knowledge was limited and, upon self-reflection, I was uneasy about the idea of big data collecting, measuring, and inferring information from social media likes, tweets, heart rate data, virology, shopping habits, etc. I tend to approach and review technology with an information security mindset. However, after completing the class, reviewing the different ways in which big data can be used, and seeing first-hand the anonymization of such data, I now realize how such data can be a boon that expands the ways in which we can analyze information. As we learned in week 4 with dashboard designs and readings such as the Top 5 Most Influential visualizations of all time, visualizing data can tell a fuller and richer story, and putting crucial data into a one-page dashboard lets the reader take in the big picture at a glance.


Throughout the class we saw examples of visualizing data, whether it was a dashboard built in a tool such as Tableau, a star schema for organizing fact table(s) with dimension table(s), Google Analytics to reveal and interpret various web metrics of a website, or a network visualization to show the nodes and edges of a network. A star schema uses fact tables and dimension tables to represent a business process. When building a star schema, it’s important to take your time on the initial step of selecting and defining the business process to model. The entire star schema could be of little value if the business process is not accurately determined. Everyone needs to be on the same page about what it means to do xyz. One person’s understanding of the business process could be wildly different from that of another person who actually performs it. For that reason, it is important to conduct interviews and make sure that everyone agrees on what the business process to be measured actually is.


From there, you can work your way out and determine the grain of the business process. In other words, determine what a single row in the fact table will represent. It is best to define the grain at the atomic level. For example, if one is creating a star schema for a grocery store, a starting point for the fact table would be a single line item of a purchase receipt. You would then figure out how to measure that line item using things like quantity sold and dollar amount of the sale, so quantity sold and sales dollar amount become part of the fact table. From there, you would start adding dimension table(s) such as product, date of sale, and, if there are multiple grocery stores, store location.
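
As a rough illustration of the grocery store example above, here is a minimal sketch in Python using the built-in sqlite3 module. The table names, column names, and sample rows are all made up for illustration and are not from the class materials. The fact table holds one row per receipt line item with the two measures (quantity sold and sales amount), and each dimension gets its own small table.

```python
import sqlite3

# In-memory database; every table/column name below is an illustrative assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_date    (date_key    INTEGER PRIMARY KEY, sale_date TEXT);
CREATE TABLE dim_store   (store_key   INTEGER PRIMARY KEY, location TEXT);
-- Grain: one row per line item on a purchase receipt.
CREATE TABLE fact_sales (
    product_key   INTEGER REFERENCES dim_product(product_key),
    date_key      INTEGER REFERENCES dim_date(date_key),
    store_key     INTEGER REFERENCES dim_store(store_key),
    quantity_sold INTEGER,
    sales_amount  REAL
);
""")

# Made-up sample data.
conn.executemany("INSERT INTO dim_product VALUES (?, ?)", [(1, "Milk"), (2, "Bread")])
conn.executemany("INSERT INTO dim_date VALUES (?, ?)", [(1, "2020-01-01")])
conn.executemany("INSERT INTO dim_store VALUES (?, ?)", [(1, "Tucson"), (2, "Phoenix")])
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, 1, 2, 5.00),  # 2 x Milk at the Tucson store
        (2, 1, 1, 1, 3.00),  # 1 x Bread at the Tucson store
        (1, 1, 2, 3, 7.50),  # 3 x Milk at the Phoenix store
    ],
)


def sales_by_store(connection):
    """Join the fact table to the store dimension and total the measures."""
    return connection.execute("""
        SELECT s.location, SUM(f.quantity_sold), SUM(f.sales_amount)
        FROM fact_sales AS f
        JOIN dim_store AS s ON f.store_key = s.store_key
        GROUP BY s.location
        ORDER BY s.location
    """).fetchall()


print(sales_by_store(conn))
```

Because the grain is the individual line item, the same fact table can be rolled up by product or by date just by joining to a different dimension table.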


In my profession of financial aid, students are the customers. The most granular level would be a single financial aid award item, such as a Federal Pell Grant. Facts would include the student’s EFC (Expected Family Contribution) and the Pell Grant award amount. Dimensions would include date of disbursement, award type (such as Federal Pell Grant or Federal Direct Unsubsidized Loan), year in school, etc. I bring this up because I found myself thinking about the things learned in class and applying them to my job.


Another example of applying the class to my job, besides star schemas, is the balanced scorecard. After I learned the structure for building balanced scorecards, my supervisor expressed during a one-on-one meeting the goal that she and another director share of being the best financial aid office in the country. I suggested the balanced scorecard approach, used their stated goal as the vision, and then shared with her the four quadrants: financial (how should we appear to the Arizona Board of Regents?), customer (how should we appear to students?), internal business process (to satisfy students, what business processes must we excel at?), and learning and growth (how will we sustain change and improve?).


I’d like to share just one more example. After reading about data warehouses and OLTP vs OLAP, I found myself better understanding how our main database works at my job. Most of my daily functions and processes run in an OLTP system, but there are times when I have to go to other environments to pull data, and I always wondered why. Learning about OLAP explained why the separation exists and why I have to pull data from a separate database.


I found the readings on dashboards and dashboard design to be informative. The assignment on building a dashboard using data on bird strikes on aircraft was a challenging exercise, both in drawing valuable insights from the dataset and in designing a useful dashboard from them. I followed the recommendations in the readings to favor practical dashboards over flashy ones, and I thought about ways to present the data that would be helpful to the FAA. I tried to put myself in the shoes of an FAA official and thought about the statistics I would want to know, such as a running tally of bird strikes per aircraft model, injuries per airline to track potential safety protocols not being followed, damage in relation to elevation, which geographical areas are more prone to bird strikes, number of deaths and injuries per year to track trends, and number of bird strikes per year categorized by bird size.


Analyzing data doesn’t come naturally to me, so I had to work at this assignment, but I enjoyed the challenge of figuring out how I could use the data, playing around with what went on the x and y axes, and adding filters. Once I had the data, I found it another unique challenge to piece all of the information together into a cohesive one-page layout. I played around with widget sizes and held true to my goal of presenting data visualizations that answer the specific questions I had set out to address. I’ve also identified a real-world job application: using Tableau to visualize data on professional judgment appeals in my profession. I have started putting together a dashboard for tracking the number of professional judgment appeals year over year and the outcomes of those appeals. In layman’s terms, a professional judgment appeal is when a financial aid professional uses their own professional judgment to change data elements in the financial aid application so that it better reflects the student’s current ability to pay.


I was most excited to learn about web metrics and how to leverage them to optimize for Google searches. Using Google’s Merchandise Store for assignment III was helpful in showing the robust features of Google Analytics. I noticed that there were many plug-and-play analyses available, such as default channel groups and conversion funnels. I suspect that the plug-and-play analyses aren’t as seamless on real-world websites as they are on the Google Merchandise Store. However, I am eager to use Google Analytics to analyze a particular retail website and a non-profit website.


In particular, I’d like to build a conversion funnel for the retail website to see where users abandon the purchase event. For the non-profit site, I’d also like to build a conversion funnel and see where abandonment occurs for the event of users joining the email list. Lastly, I’d like to explore Google search optimization much further. I found the readings on web metrics to be a great supplement to assignment III, and I was intrigued by the different metrics like organic search, direct traffic, paid traffic, etc.
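
To make the funnel idea concrete, here is a small sketch in plain Python of how abandonment between steps could be calculated. It does not use Google Analytics itself; the step names and user counts are hypothetical numbers chosen only to show the arithmetic.

```python
def funnel_report(steps):
    """Given ordered (step_name, user_count) pairs, return
    (step_name, user_count, drop_rate) where drop_rate is the share of
    users from the previous step who did not reach this step."""
    report = []
    previous = None
    for name, count in steps:
        if previous:
            drop_rate = (previous - count) / previous
        else:
            drop_rate = 0.0  # first step (or an empty previous step) has no drop
        report.append((name, count, drop_rate))
        previous = count
    return report


def biggest_drop(steps):
    """Name of the step that lost the largest share of the previous step's users."""
    return max(funnel_report(steps), key=lambda row: row[2])[0]


# Hypothetical counts for a retail purchase funnel.
steps = [
    ("View product", 1000),
    ("Add to cart", 400),
    ("Checkout", 200),
    ("Purchase", 100),
]

for name, count, drop in funnel_report(steps):
    print(f"{name:<15} {count:>5}  lost {drop:.0%} of previous step")

print("Largest abandonment happens before:", biggest_drop(steps))
```

The same function would work for the email list funnel by swapping in steps such as landing page, sign-up form, and confirmation.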


I found assignment IV and the accompanying topics from lectures 11-15 to be fascinating in terms of ways to display, interpret, and analyze data. Treating many things found in the world, such as traffic, friend groups, and flight patterns, as networks makes perfect sense. Distilling things down to nodes and relationships down to edges helped to simplify the relationship network. From there we can expand and talk about directionality (inwards or outwards) and degree (number of connections). Eigenvector centrality is another interesting metric that looks not at the number of connections but at the quality of those connections. Per our network properties lecture, Google’s search ranking is similar in concept, in that a page matters more when the pages linking to it are themselves important. Other helpful measurements include degree centrality, which counts a node’s direct connections, and closeness centrality, which measures how short the paths (edges) are from one node to all of the others.
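
The three centrality measures mentioned above can be sketched in a few lines of plain Python. This is only an illustration on a tiny made-up network of four nodes, not the Twitter data from the assignment, and it assumes an undirected, connected graph. Tools like Gephi compute these for you; the eigenvector version here uses simple repeated averaging (power iteration) and scales the top score to 1.

```python
from collections import deque

# Made-up undirected network: A is connected to everyone, B and C to each other.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]


def build_adjacency(edge_list):
    """Map each node to the set of its neighbors."""
    adjacency = {}
    for u, v in edge_list:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def degree_centrality(adjacency):
    """Raw degree: number of direct connections per node."""
    return {node: len(neighbors) for node, neighbors in adjacency.items()}


def closeness_centrality(adjacency):
    """(reachable nodes) / (sum of shortest-path lengths), found by breadth-first search."""
    result = {}
    for source in adjacency:
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in distance:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
        total = sum(distance.values())
        result[source] = (len(distance) - 1) / total if total else 0.0
    return result


def eigenvector_centrality(adjacency, iterations=100):
    """Each node's score is the sum of its neighbors' scores, rescaled each round."""
    score = {node: 1.0 for node in adjacency}
    for _ in range(iterations):
        new_score = {n: sum(score[nb] for nb in adjacency[n]) for n in adjacency}
        top = max(new_score.values())
        score = {n: value / top for n, value in new_score.items()}
    return score


adjacency = build_adjacency(edges)
degree = degree_centrality(adjacency)
closeness = closeness_centrality(adjacency)
eigenvector = eigenvector_centrality(adjacency)

for node in sorted(adjacency):
    print(node, degree[node], round(closeness[node], 2), round(eigenvector[node], 2))
```

In this toy network, node D has only one connection, but that connection is to the best-connected node, which is the kind of "quality over quantity" effect eigenvector centrality is meant to capture.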


For me, module 3 of the class was the payoff, where I began to see how inferences can be made from big data. Yes, we saw many cases of inference in the prior modules, readings, and projects, but seeing large amounts of data in Gephi, and seeing that data worked into meaningful forms, helped to illustrate how big data can be further analyzed and interpreted to find deeper meaning. It was very helpful to complete assignment IV, see the visual representation of Twitter data, and watch the group connections reveal themselves. I used the visual representation to infer more relationships and context, and I also analyzed the data via scatter plots and pie charts. Thank you for reading my final blog post!


Sincerely,

Josh
