Categorizing drugs and explicit Websites using text mining: Text Based Categorization
Halmstad University.
2018 (English). Independent thesis, Advanced level (degree of Master (One Year)), 10 credits / 15 HE credits. Student thesis
Abstract [en]

This report is the result of a master's thesis in network forensics at Halmstad University during the spring term of 2018.

The focus of the thesis was to analyse websites and categorize them into a few categories, namely drugs, explicit content, and others, using text mining.

In today’s world, a huge amount of data is available on the World Wide Web. As the number of internet users grows, the number of websites increases dramatically as well. The content ranges from education to illegal activities. Website categorization organizes websites from unorganized data; its main purpose is to place a large number of websites into appropriate categories and/or to help security personnel manage user activity. Why use text? Categorizing websites by their images runs into a problem: many images could fall into several categories. An image of white powder, for example, might show baby powder, cocaine, or something else. Website categorization using text mining aims to avoid this problem. In this paper, we develop a method to better categorize these websites.
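The abstract does not say which classifier the thesis uses, so the following is only a minimal sketch of text-based categorization in general. It assumes a bag-of-words multinomial Naive Bayes model with add-one smoothing; the class name `NaiveBayesTextClassifier`, the helper `tokenize`, and the tiny training set are all invented for illustration and do not come from the thesis.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTextClassifier:
    """Hypothetical multinomial Naive Bayes over word counts (illustration only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior of the category.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # words never seen in training carry no evidence
                count = self.word_counts[label][token]
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy pages, one string per page, using the three categories
# named in the abstract.
TRAIN = [
    ("buy cocaine and heroin online discreet shipping", "drugs"),
    ("cannabis seeds and pills for sale", "drugs"),
    ("adult videos explicit content over eighteen only", "explicit"),
    ("explicit adult pictures and videos", "explicit"),
    ("university courses lectures and education news", "other"),
    ("baby powder and diapers for your newborn", "other"),
]

classifier = NaiveBayesTextClassifier().fit(
    [text for text, _ in TRAIN], [label for _, label in TRAIN]
)

print(classifier.predict("cocaine for sale online"))   # drugs
print(classifier.predict("baby powder for newborn"))   # other
```

In this toy setting the surrounding words ("baby", "newborn" versus "cocaine", "sale") decide the category, which is the kind of context an image of white powder alone cannot provide.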

Place, publisher, year, edition, pages
2018. p. 41
Keywords [en]
Drugs, explicit content, machine learning, website categorization, text mining.
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:hh:diva-38088
OAI: oai:DiVA.org:hh-38088
DiVA, id: diva2:1252308
Subject / course
Digital Forensics
Educational program
Master's Programme in Network Forensics, 60 credits
Supervisors
Examiners
Available from: 2018-10-02 Created: 2018-10-01 Last updated: 2018-10-02 Bibliographically approved

Open Access in DiVA

fulltext (2325 kB), 393 downloads
File information
File name: FULLTEXT02.pdf
File size: 2325 kB
Checksum (SHA-512): 7ef964da6a70d4b6f6b045bcc655646ec0c9a575a6e12936e259e36c59f572752dcc14399f2180feb98fc5946c6fd3c8bd760ed8acf69a13ca6bd99572483ff8
Type: fulltext
Mimetype: application/pdf
