Raison d'être

To help Twitter users reach the largest possible audience and increase Twitter activity between users.

Project Goals

1. Pull data from Twitter and determine which characteristics make a tweet "optimum".
      a. Look at the connection between when tweets are posted, when followers are tweeting, and how likely a tweet is to be retweeted or replied to.
      b. Look at the topics that are trending among followers and how likely they are to retweet or reply to a tweet that contains those hashtags.
      c. Identify how the style of a tweet affects its likelihood of being retweeted or replied to.
            i. Is it better to use a bitly link or a complete link, or is there no correlation with retweets or replies?
           ii. Does the use of lowercase or uppercase letters correlate with the number of followers who reply or retweet?
          iii. Does punctuation correlate with followers replying or retweeting?
           iv. How does the frequency of tweeting affect the retweet rate, if at all?
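
As a rough illustration of the style questions in goal 1c, the sketch below turns the text of one tweet into a few simple features (link type, share of uppercase letters, punctuation). The function name `style_features`, the list of shortener domains, and the choice of punctuation marks are assumptions made for this example, not decisions the project has made.

```python
import re

# Assumed list of link-shortener domains; the project may settle on a different one.
SHORTENER_DOMAINS = ("bit.ly", "t.co", "tinyurl.com")

# Assumed set of punctuation marks to count ('#' and '@' are left out on purpose,
# since they mark hashtags and mentions rather than punctuation).
PUNCTUATION = set(".,;:!?")

URL_PATTERN = re.compile(r"https?://(\S+)")


def style_features(text):
    """Return a dict of simple style features for the text of one tweet."""
    urls = URL_PATTERN.findall(text)
    short_links = [u for u in urls if u.lower().startswith(SHORTENER_DOMAINS)]

    # Strip links before measuring case and punctuation so URLs do not skew the counts.
    body = URL_PATTERN.sub("", text)
    letters = [c for c in body if c.isalpha()]
    upper = sum(1 for c in letters if c.isupper())

    return {
        "has_short_link": bool(short_links),
        "has_full_link": len(urls) > len(short_links),
        "uppercase_ratio": upper / len(letters) if letters else 0.0,
        "punctuation_count": sum(1 for c in body if c in PUNCTUATION),
    }
```

Each feature could then be compared against retweet and reply counts to check whether any correlation exists.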

2. Provide the user with two data sets that detail our findings from the data mining: a personal, customizable one and a generic one.
       a. Personal Data Set: Provide the user with a dynamic set of information that shows statistics regarding topics, time of day, time taken to retweet or reply, tweeting/retweeting ratio, and style of their friends' tweets, as well as what their friends have been retweeting and replying to.
       b. Generic Data Set: Provide the user with a broader set of information that includes statistics from our initial results on trends, response times, and styles that are popular worldwide. This data set will provide general information about effective tweeting techniques and will not be limited to friends' trends.
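
One time-of-day statistic that either data set might report is the share of tweets posted in each hour that were retweeted at least once. The sketch below is only an illustration: the function name `retweet_rate_by_hour` and the `(created_at, retweet_count)` input shape are assumptions, and the sample values are invented for the example rather than taken from collected data.

```python
from collections import defaultdict
from datetime import datetime


def retweet_rate_by_hour(tweets):
    """Map each posting hour (0-23) to the fraction of tweets that were retweeted.

    `tweets` is an iterable of (created_at, retweet_count) pairs, where
    created_at is a datetime and retweet_count is an int (assumed shape).
    """
    totals = defaultdict(int)
    retweeted = defaultdict(int)
    for created_at, retweet_count in tweets:
        totals[created_at.hour] += 1
        if retweet_count > 0:
            retweeted[created_at.hour] += 1
    return {hour: retweeted[hour] / totals[hour] for hour in sorted(totals)}


# Invented sample data, for illustration only.
sample = [
    (datetime(2020, 1, 1, 9, 0), 3),
    (datetime(2020, 1, 1, 9, 30), 0),
    (datetime(2020, 1, 1, 14, 15), 1),
]
rates = retweet_rate_by_hour(sample)
```

The same grouping could be run over a user's friends for the personal data set and over the full collection for the generic one.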

3. Define what makes a tool useful and appealing to a user.

4. Develop a user-friendly tool to optimize tweets.

5. Refine and distribute tool.

Expected Results


  1. Create a tool that gives the user information before a tweet is sent, in order to optimize the tweet's effectiveness.
  2. Find two relevant data sets and provide them to the user as a dashboard, one personal and one generic; these will help the user identify which characteristics make a tweet more appealing and more likely to draw retweets and replies.
  3. The personal data set will have customized information about the tweets of the user's friends and statistics on those friends' overall activity.
  4. The generic data set will provide information about more users than just the user's direct friends.
  5. Identify relevant metrics that are useful to the user and could change the way people tweet.

Possible Problems

  1. Not finding a data set large enough to produce relevant results.
  2. Finding no correlation between any of our proposed metrics and the data.
  3. Difficulty developing a tool that the user finds useful rather than confusing.

Project Deadlines

May 9
  • Write a program to access and store data from Twitter
  • Get data from Twitter (20,000+ tweets) and start the analysis
May 11
  • Submit a progress report with updates on our project status
  • Identify new possible metrics that could optimize tweets
May 14
  • Start data mining
  • Start breaking down our results into generic and dynamic information sets.
May 18
  • Submit initial results to the TA and obtain feedback.
  • Brainstorm, based on the initial results, which new metrics could help optimize tweets.
  • Start developing the tool that will provide the information to the user.
May 21
  • Finish the data analysis and write down our findings.
  • Finish developing the dashboard that the user will see before tweeting and try it out.
May 24
  • Try out our new tool and ask for feedback from other classmates and friends.
  • Work on the presentation of our project and finish up the details.
May 25
  • Final presentation of our project.