You need to analyze textual log data from an online chat forum related to the Anonymous hacktivist group. You will learn how to apply regular expressions, summarize log data, quantify text data, and summarize time trends.


IRC (Internet Relay Chat) is an early instant-messaging protocol developed in the early years of the Internet. Its openness and the ability to remain anonymous have made IRC a popular channel for hacker networks to collaborate and share ideas.

The data comes from [login to view URL]. It contains two years of chats between hackers associated with the hacktivist group Anonymous. In these logs they share information about malware, setting up servers to deploy attacks, and other information related to hacking systems.

The collection and analysis of these chats is a form of cyber-threat intelligence. Analyzing these chats and other dark web data sources enables proactive defense against attacks.


1. Many users log in and view the chat without commenting. Which users spent the most time in the logs? (3 pts) Which users logged in the most? (2 pts)

2. Find the most common words. (3 pts)

3. Count the total number of written messages (only those with actual text content). (2 pts) Summarize the users that posted the most messages. (2 pts)

4. Find and rank (by count) words not in an English dictionary. (3 pts) This is a simple method that can identify some names of malware tools.

5. Which hours of the day had the most messages? (2 pts) Which days had the most traffic (or messages)? (2 pts)

6. Find and list the URLs posted in the chat. (2 pts)
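
As a starting point for tasks 2 through 6, here is a minimal Python sketch. It assumes irssi-style message lines of the form "HH:MM <+nick> text"; the brief does not spell out the line format, so adjust the regular expressions to the real file. The function name `summarize`, the sample lines, and the tiny word set standing in for an English dictionary are all hypothetical.

```python
import re
from collections import Counter

# Assumed message line: "HH:MM <+nick> text". The optional character before
# the nick is a channel mode prefix such as + or @ (or irssi's padding space).
MESSAGE_RE = re.compile(r"^(\d{2}):(\d{2}) <[ +@%~&]?([^>]+)> (.*)$")
URL_RE = re.compile(r'(?:https?://|www\.)[^\s<>"]+', re.IGNORECASE)
WORD_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")


def summarize(lines, dictionary=frozenset()):
    """Tally users, words, hours, and URLs over an iterable of log lines."""
    users, words, hours, unknown = Counter(), Counter(), Counter(), Counter()
    urls = []
    for line in lines:
        match = MESSAGE_RE.match(line.rstrip("\n"))
        if match is None:
            continue  # joins, quits, day markers, and other non-message lines
        hour, _minute, nick, text = match.groups()
        if not text.strip():
            continue  # task 3 counts only messages with actual text
        users[nick] += 1
        hours[hour] += 1
        urls.extend(URL_RE.findall(text))
        # Strip URLs before tokenizing so their fragments are not counted as words.
        for word in WORD_RE.findall(URL_RE.sub(" ", text).lower()):
            words[word] += 1
            if dictionary and word not in dictionary:
                unknown[word] += 1
    return {"users": users, "words": words, "hours": hours,
            "unknown": unknown, "urls": urls}


# Hypothetical sample lines, for illustration only.
sample_lines = [
    "--- Day changed Mon Sep 26 2016",
    "09:15 <+alice> check the new loader at http://example.com/a",
    "09:20 <bob> the loader works",
    "21:05 <+alice> the zbot panel is up",
    "21:06 -!- carol [carol@host.example] has joined #chan",
]
# Stand-in for a real English word list (for example, one word per line in a file).
tiny_dictionary = {"check", "the", "new", "loader", "at", "works", "panel", "is", "up"}

summary = summarize(sample_lines, tiny_dictionary)
print(summary["users"].most_common(3))
print(summary["words"].most_common(3))
print(summary["unknown"].most_common(3))
```

The total number of written messages is `sum(summary["users"].values())`. For the real data, a stop-word filter on the word counts would probably make the "most common words" answer more informative; that choice is left open here.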


This analysis portion of the assignment is graded out of 10 points. The maximum score for analysis is 15 points.

Your code should also be well documented with comments, sources, and explanations of what is happening. Fully documented code will receive full credit. Mostly complete documentation will receive a deduction of 1 point; minimal documentation will result in a deduction of 2 points; and no documentation will result in a deduction of 3 points from your score.


Submit your code, accompanying documentation, and evidence that your program works in a PDF or Word document. The instructor may wish to see a demonstration of your code.


<+evilbot> This user is a bot. If possible, filter this user’s posts from the chat.
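
One way to filter the bot is sketched below in Python. It assumes the same irssi-style "HH:MM <+nick> text" message lines; the names `NICK_RE`, `BOT_NICKS`, and `drop_bot_lines` are hypothetical.

```python
import re

# Assumed message line: "HH:MM <+nick> text"; the mode prefix (+, @, ...) is optional.
NICK_RE = re.compile(r"^\d{2}:\d{2} <[ +@%~&]?([^>]+)>")

# evilbot is the bot named in the assignment; add other bots here if you find any.
BOT_NICKS = {"evilbot"}


def drop_bot_lines(lines, bots=BOT_NICKS):
    """Yield every line except messages posted by a nick listed in `bots`."""
    for line in lines:
        match = NICK_RE.match(line)
        if match and match.group(1).lower() in bots:
            continue  # skip the bot's message
        yield line  # keep human messages and all non-message lines
```

Running the log through this filter before any counting keeps the bot out of the word, user, and hour tallies.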

You can identify day changes by marker lines such as “--- Day changed Mon Sep 26 2016”. In some places this marker is missing. You can correct for this by watching the time of day: when the hour rolls over to 00, a new day has started.
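
The day tracking described above could be sketched in Python as follows. It assumes every timestamped line starts with "HH:MM " and treats any backwards jump in the clock as a missing day marker; the function name `attach_dates` is hypothetical. A gap of more than one day with no marker, or lines stored out of order, would mislead this heuristic.

```python
import re
from datetime import datetime, timedelta

DAY_RE = re.compile(r"^--- Day changed (\w{3} \w{3} \d{1,2} \d{4})\s*$")
# Assumed timestamp prefix on every event line: "HH:MM ".
TIME_RE = re.compile(r"^([01]\d|2[0-3]):([0-5]\d) ")


def attach_dates(lines):
    """Yield (datetime, line) for each timestamped line after the first day marker."""
    current_day = None   # midnight of the day currently being read
    last_minutes = None  # minutes since midnight of the previous timestamped line
    for raw in lines:
        line = raw.rstrip("\n")
        marker = DAY_RE.match(line)
        if marker:
            # Relies on English day and month names (the default C locale).
            current_day = datetime.strptime(marker.group(1), "%a %b %d %Y")
            last_minutes = None
            continue
        stamp = TIME_RE.match(line)
        if stamp is None or current_day is None:
            continue  # untimed line, or no date known yet
        minutes = int(stamp.group(1)) * 60 + int(stamp.group(2))
        if last_minutes is not None and minutes < last_minutes:
            # The clock went backwards with no marker: assume midnight passed.
            current_day += timedelta(days=1)
        last_minutes = minutes
        yield current_day + timedelta(minutes=minutes), line
```

Counting the resulting datetimes by `.date()` or `.hour` with `collections.Counter` then gives the busiest days and hours.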

Users can change their usernames. An alternative to usernames for tracking login-logout behavior is to use their login identifiers (for example: [androirc@[login to view URL]]).
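
A minimal Python sketch of session tracking keyed on the login identifier follows. It assumes irssi-style event lines such as "HH:MM -!- nick [user@host] has joined #channel" and "... has quit [reason]"; the brief does not show these lines, so check the format against the real log. The function name `session_stats` is hypothetical, and the input is expected as (datetime, line) pairs with the date already resolved.

```python
import re
from collections import Counter, defaultdict
from datetime import timedelta

# Assumed event line: "HH:MM -!- nick [user@host] has joined|quit|left ...".
EVENT_RE = re.compile(r"^\d{2}:\d{2} -!- (\S+) \[([^\]]+)\] has (joined|quit|left)\b")


def session_stats(events):
    """Return (login counts, total time online), both keyed by user@host.

    `events` is an iterable of (datetime, line) pairs in log order.
    """
    logins = Counter()
    online = defaultdict(timedelta)
    open_since = {}  # identifier -> time of the earliest unmatched join
    for when, line in events:
        match = EVENT_RE.match(line)
        if match is None:
            continue
        _nick, ident, action = match.groups()
        if action == "joined":
            logins[ident] += 1
            open_since.setdefault(ident, when)
        elif ident in open_since:
            # A quit or part closes the open session for this identifier.
            online[ident] += when - open_since.pop(ident)
    return logins, online
```

Sessions still open when the log ends are not counted toward time online in this sketch, and several people sharing one identifier would be merged into one; both are simplifications you may want to handle differently.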

Skills: Computer Security, Internet Security, Linux, PHP, Web Security

Project ID: #19584513
