Thursday, June 30, 2016

How to Tell a Story with Data

An excellent visualization, according to Edward Tufte, expresses “complex ideas communicated with clarity, precision and efficiency.” I would add that an excellent visualization also tells a story through the graphical depiction of statistical information. As I discussed in an earlier post, visualization in its educational or confirmational role is really a dynamic form of persuasion. Few forms of communication are as persuasive as a compelling narrative. To this end, the visualization needs to tell a story to the audience. Storytelling helps the viewer gain insight from the data. (For a great example, how much do you think steroids have influenced baseball?)
So how does a visual designer tell a story with a visualization? The analysis has to find the story that the data supports. Traditional journalism does this all the time, and journalists have become very good at storytelling with visualization via infographics. In that vein, here are some journalistic strategies on telling a good story that apply to data visualizations as well.
  1. Find the compelling narrative. Along with giving an account of the facts and establishing the connections between them, don’t be boring. You are competing for the viewer’s time and attention, so make sure the narrative has a hook, momentum, or a captivating purpose. Finding the narrative structure will help you decide whether you actually have a story to tell. If you don’t, then perhaps this visualization should support exploratory data analysis (EDA) rather than convey information. Even so, the designer of an exploratory visualization still needs to spark viewers’ imagination, encouraging them to examine relationships in the data and making it easy to interact with it (think gamification).
  2. Think about your audience. What does the audience know about the topic? Is it meant for decision makers, general interested parties, or others? The visualization needs to be framed around the level of information the audience already has, correct and incorrect:
    • Novice: first exposure to the subject, but doesn’t want oversimplification
    • Generalist: aware of the topic, but looking for an overview understanding and major themes
    • Managerial: in-depth, actionable understanding of intricacies and interrelationships with access to detail
    • Expert: more exploration and discovery and less storytelling with great detail
    • Executive: only has time to glean the significance and conclusions of weighted probabilities
  3. Be objective and offer balance. A visualization should be devoid of bias. Even if it is arguing to influence, it should be based upon what the data says, not what you want it to say. Tufte found numerous charts that misled viewers about the underlying data, and created a formula to quantify how misleading a graphic is, called the “Lie Factor.” The Lie Factor is the size of the effect shown in the graphic divided by the size of the effect in the data. Sometimes the distortion is unintentional: a number that is three times bigger than another will be perceived as nine times bigger if it is drawn as a shape whose width and height both scale with the value, and 3D effects exaggerate it further. There are simple ways to encourage objectivity: labeling to avoid ambiguity, matching graphic dimensions to data dimensions, using standardized units, and keeping design elements from compromising the data. Balance can come from alternative representations of the data in the same visualization (multiple clusterings; confidence intervals instead of lines; changing timelines; alternative color palettes and assignments; variable scaling). Maintaining objectivity and balance is not a trivial effort, and it is easy to violate unintentionally. Viewers and decision makers will eventually sniff out inconsistencies, which in turn will cost the designer trust and credibility, no matter how good the story.
  4. Don’t Censor. Don’t be selective about the data you include or exclude, unless you’re confident you’re giving your audience the best representation of what the data “says”. This selectivity includes using discrete values when the data is continuous; how you deal with missing, outlier, and out-of-range values; arbitrary temporal ranges; and capped values, volumes, ranges, and intervals. Viewers will eventually notice the selectivity and lose trust in the visualization (and any others you might produce).
  5. Finally, Edit, Edit, Edit. Also, take care to really try to explain the data, not just decorate it. Don’t fall into the “it looks cool” trap when it might not be the best way to explain the data. As journalists and writers know, if you are spending more time editing and improving your visualization than creating it, you are probably doing something right.
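
The Lie Factor from item 3 can be worked through in a few lines of code. The following is a minimal sketch, not Tufte's own formulation: it assumes "size of effect" means the relative change between two values, and the function names and example numbers are made up for illustration. It reproduces the case from the text where a value that triples is drawn as a shape whose width and height both triple.

```python
def effect_size(first: float, second: float) -> float:
    """Relative change from `first` to `second` (assumed meaning of 'size of effect')."""
    if first == 0:
        raise ValueError("first value must be non-zero")
    return (second - first) / abs(first)


def lie_factor(data_first: float, data_second: float,
               graphic_first: float, graphic_second: float) -> float:
    """Effect shown in the graphic divided by the effect in the data."""
    data_effect = effect_size(data_first, data_second)
    if data_effect == 0:
        raise ValueError("the data shows no change, so the ratio is undefined")
    return effect_size(graphic_first, graphic_second) / data_effect


# Hypothetical example: the data triples (1 -> 3), but it is drawn as a
# square whose side triples, so the drawn area grows from 1 to 9.
side_small, side_large = 1.0, 3.0
print(lie_factor(1.0, 3.0, side_small ** 2, side_large ** 2))  # 4.0
```

A result of 1.0 means the graphic shows the same effect as the data; the further the result is from 1.0, the more the graphic distorts it.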

Info-graphic Idea 2


Tool to Download GIF File from Twitter

                         https://savedeo.com/

Twitter currently has no option to save a GIF directly in its app. Another problem is that Twitter GIFs are not real GIFs. They are just videos that automatically play in a loop. However, it is possible to download a Twitter GIF and convert it to a real GIF in just two steps.
First, you need to download the GIF as video. SaveDeo makes that possible:
  1. Go to: https://savedeo.com/en/sites/twitter
  2. Paste the direct link to the post into the input box.
  3. Click download.
Now you have the GIF in the MP4 video format. Next you need to convert the video to a GIF. For that you can use EZGIF:

  1. Go to: http://ezgif.com/video-to-gif
  2. Upload the video you just downloaded.
  3. Set the start time to 0
  4. Set the end time to the length of the video.
  5. Click convert to GIF.
  6. Click save to download the file.
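
If you would rather do the conversion step on your own machine, the sketch below shows one possible alternative to EZGIF. It is not part of the workflow above: it assumes the ffmpeg command-line tool is installed and on your PATH, and the file names, frame rate, and width are placeholder values chosen for illustration.

```python
import shutil
import subprocess
from typing import List


def build_gif_command(video_path: str, gif_path: str,
                      fps: int = 10, width: int = 480) -> List[str]:
    """Build an ffmpeg command that converts a video file to a looping GIF."""
    # Lower the frame rate and scale to the target width, keeping the aspect ratio.
    filters = f"fps={fps},scale={width}:-1:flags=lanczos"
    # -y overwrites an existing output file; -loop 0 makes the GIF loop forever.
    return ["ffmpeg", "-y", "-i", video_path, "-vf", filters, "-loop", "0", gif_path]


def convert_to_gif(video_path: str, gif_path: str) -> None:
    """Run the conversion, failing early if ffmpeg is not installed."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    subprocess.run(build_gif_command(video_path, gif_path), check=True)


# Placeholder file names: the MP4 downloaded in the first step and the GIF to create.
print(" ".join(build_gif_command("tweet.mp4", "tweet.gif")))
```

Calling `convert_to_gif("tweet.mp4", "tweet.gif")` would then run the printed command.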

Info graphic Idea



1. Registration with the United Nations.
2. Interview with the United Nations.
3. Refugee status granted by the United Nations.
4. Referral for resettlement in the United States.
The United Nations decides if the person fits the definition of a refugee and whether to refer the person to a country for resettlement. Only the most vulnerable are referred, accounting for fewer than 1 percent of refugees worldwide. Some people spend years waiting in refugee camps.
5. Interview with State Department contractors.
6. First background check.
7. Higher-level background check for some.
8. Another background check.
The refugee’s name is run through law enforcement and intelligence databases for terrorist or criminal history. Some go through a higher-level clearance before they can continue. A third background check was introduced in 2008 for Iraqis but has since been expanded to all refugees ages 14 to 65.
9. First fingerprint screening; photo taken.
10. Second fingerprint screening.
11. Third fingerprint screening.
The refugee’s fingerprints are screened against F.B.I. and Homeland Security databases, which contain watch list information and past immigration encounters, including if the refugee previously applied for a visa at a United States embassy. Fingerprints are also checked against those collected by the Defense Department during operations in Iraq.
12. Case reviewed at United States immigration headquarters.
13. Some cases referred for additional review.
Syrian applicants must undergo these two additional steps. Each is reviewed by a United States Citizenship and Immigration Services refugee specialist. Cases with “national security indicators” are given to the Homeland Security Department’s fraud detection unit.
14. Extensive, in-person interview with Homeland Security officer.
Most of the interviews with Syrians have been done in Jordan and Turkey.
15. Homeland Security approval is required.
If the House bill becomes law, the director of the F.B.I., the Homeland Security secretary and the director of national intelligence would be required to confirm that the applicant poses no threat.
16. Screening for contagious diseases.
17. Cultural orientation class.
18. Matched with an American resettlement agency.
19. Multi-agency security check before leaving for the United States.
Because of the long time between the initial screening and departure, officials conduct a final check before the refugee leaves for the United States.
20. Final security check at an American airport.
https://twitter.com/nytgraphics/status/742724473967812608/photo/1

What is an INFOGRAPHIC?

Info-graphics are visual representations of information, data, or knowledge. Info-graphics can be used to present complex concepts quickly and clearly to the general population.

Excel: Count a specific character in a cell



This formula works by using SUBSTITUTE to first remove all of the characters being counted from the source text. Then the length of the text (with the character removed) is subtracted from the length of the original text. The result is the number of characters that were removed with SUBSTITUTE, which is equal to the count of those characters.
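
The formula itself is not shown in this post. A formula of the kind described would presumably look like =LEN(A1)-LEN(SUBSTITUTE(A1,"a","")), where the cell A1 and the character "a" are assumptions made here for illustration. The same remove-and-subtract idea can be sketched in Python, with a hypothetical helper name:

```python
def count_char(text: str, char: str) -> int:
    """Count occurrences of one character by removing it and comparing lengths."""
    if len(char) != 1:
        raise ValueError("char must be a single character")
    # Like SUBSTITUTE, str.replace is case-sensitive.
    without_char = text.replace(char, "")
    return len(text) - len(without_char)


print(count_char("banana", "a"))  # 3
```

Because the match is case-sensitive, counting "b" in "Banana" gives 0. To count both cases, convert the text to one case first (in Excel, wrap the cell reference in UPPER or LOWER).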