ThreeTextBlocks
Your instructor has given you three blocks of text. Your task is to characterize the text and attempt to determine which two blocks were written by the same author. Fortunately, this is a programming class, so your task here is simply to characterize the three text passages. Use at least three methods to characterize the text. The code should clearly indicate which methods you are using. Print your metrics for each text block. Attach these metrics as comments in your main class and turn in all classes.
Don't kill yourself picking 3 methods - this isn't a class in natural language processing (NLP). Some example methods: 1) Count the total number of words in each block of text (yes, yes, not a very useful characterization, but see the first sentence above). 2) Word Frequencies - count the number of occurrences of each word used in each block (the presumption is the same author will choose the same words). 3) Count the number of sentences and average the number of words per sentence.
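The three example methods above can be sketched as follows. This is only a minimal illustration, assuming Java (the assignment mentions a "main class" but does not name a language). The class name TextMetrics, its method names, and the sample strings are hypothetical. The definitions of "word" (a run of letters, digits, or apostrophes) and "sentence" (text ending in ., ! or ? followed by whitespace) are simplifying assumptions, not requirements of the assignment.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class: one static method per characterization method.
final class TextMetrics {

    // Split text into lowercase words. Assumption: a word is a run of
    // letters, digits, or apostrophes (straight ' or curly \u2019).
    static String[] tokenize(String text) {
        String cleaned = text.toLowerCase(Locale.ROOT)
                .replaceAll("[^a-z0-9'\u2019]+", " ")
                .trim();
        return cleaned.isEmpty() ? new String[0] : cleaned.split(" ");
    }

    // Method 1: total number of words in the block.
    static int countWords(String text) {
        return tokenize(text).length;
    }

    // Method 2: number of occurrences of each word (sorted alphabetically).
    static Map<String, Integer> wordFrequencies(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : tokenize(text)) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    // Method 3a: number of sentences. Naive assumption: a sentence ends with
    // '.', '!' or '?' followed by whitespace, so "skype.com" is not split.
    static int countSentences(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return 0;
        }
        int count = 0;
        for (String part : trimmed.split("[.!?]+\\s+")) {
            if (!part.trim().isEmpty()) {
                count++;
            }
        }
        return count;
    }

    // Method 3b: average number of words per sentence.
    static double averageWordsPerSentence(String text) {
        int sentences = countSentences(text);
        return sentences == 0 ? 0.0 : (double) countWords(text) / sentences;
    }

    public static void main(String[] args) {
        // Placeholder strings; paste the three real text blocks here.
        String[] blocks = {
            "The cat sat. The cat ran!",
            "I use skype.com for video calls. It never charges me."
        };
        for (int i = 0; i < blocks.length; i++) {
            System.out.printf("Text%d%n", i + 1);
            System.out.printf("  words: %d%n", countWords(blocks[i]));
            System.out.printf("  sentences: %d%n", countSentences(blocks[i]));
            System.out.printf("  avg words/sentence: %.2f%n",
                    averageWordsPerSentence(blocks[i]));
            System.out.printf("  frequencies: %s%n", wordFrequencies(blocks[i]));
        }
    }
}
```

Keeping each method in its own clearly named function is one way to make the code "clearly indicate which methods you are using"; the printed values can then be copied into comments in the main class.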
This type of activity is EXTREMELY popular in many areas (search, DB, AI, NLP, etc.). Information Science is built around this. There are many other methods out there. You may want to check the Wikipedia article on "string metric" for some other ideas. Often the code is out there as well.
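As one illustration of a string metric, here is a minimal sketch of Levenshtein edit distance, again assuming Java. The class name EditDistance is hypothetical, and the assignment does not require this metric. Note that it compares two strings rather than characterizing a single block, so it is probably more practical for comparing individual words than whole passages.

```java
// Hypothetical example class showing one common string metric.
final class EditDistance {

    // Levenshtein distance: the minimum number of single-character
    // insertions, deletions, or substitutions needed to turn a into b.
    // Uses two rolling rows of the dynamic-programming table.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];

        // Turning the empty prefix of a into the first j chars of b takes j insertions.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting the first i chars of a
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            // Swap rows so the row just computed becomes the previous row.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("kitten", "sitting"));
        System.out.println(levenshtein("internet", "Internet"));
    }
}
```

The comparison is case-sensitive as written; lowercasing both inputs first is a reasonable choice if capitalization should not count as a difference.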
=============
Text1: The Internet has vastly become one of the most prominent ways of communicating amongst one another within the last couple of decades. By utilizing tools such as VOIP, social media, emailing, instant messenger, etc., humans from all over the world have been able to share information within seconds with a mere stroke of a key or a click of a mouse. As far as personal use of The Internet’s communicative inhibitors go, I have used its services to communicate with others through mainly VOIP and email. However, on occasion I have also used tools such as Skype and Facebook. I have found that by not adhering to the advances in communicating via The Internet, I am restricting my ability to maintain a healthy relationship amongst the individuals whom I consider to be important in my life. This reliance upon The Internet’s service is both concerning and fascinating at the same time. However, no matter what one may feel about the role that The Internet plays in our everyday lives, it is no secret that its advances are imminent. Of course one cannot be certain as to how far The Internet’s evolution will go.
Text2: Search engine: I like to use safari, and Google chrome. I use safari because I think it is a little more secure comparing others due to my protection with apple. But for storing my data I prefer store them in Google doc instead of usb or other devices. I find Google doc more reliable. I used to simply type the keywords on browser that I wanted to find, but now after taking this course and the lab that we take I learned how to classified the searching tools. I like the advance search in Google. I’m going to practice it more often.
Text3: Communication: I use Internet for communication in many ways. As I mention earlier, I use it to call my family and friends. For example, I use oovoo.com or skype.com for video calling. Also I use magic jack for my phone calling. The good point about magic jack is that for instance if I go out of country, I can still answer my income calls for free because magic jack works over internet. It never charges me. Also I can call any land phone in US from outside of the country. On the other hand I use Google tools such as Google voice to forward and manage all the greeting income and outcome calling.