Detect Plagiarism in files using Python With Source Code

Introduction

Hello Curious Coders,
In this project we are going to check for plagiarism between the contents of
two different text files. In Python this can be done with the built-in difflib
module, but here we are going to do the check manually by comparing the words in
the two files. Let's get into it.
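
For comparison, here is a minimal sketch of the difflib route mentioned above. It is not the method used in this project. The function name similarity_percent and the sample sentences are made up for illustration, and comparing the texts word by word (rather than character by character) is an assumption.

```python
import difflib


def similarity_percent(text_a, text_b):
    """Return a 0-100 similarity score for two texts, compared word by word."""
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    # SequenceMatcher.ratio() returns 2 * matches / total elements, in [0, 1]
    matcher = difflib.SequenceMatcher(None, words_a, words_b)
    return round(matcher.ratio() * 100, 1)


if __name__ == "__main__":
    a = "the quick brown fox jumps over the lazy dog"
    b = "the quick brown cat jumps over the lazy dog"
    print(similarity_percent(a, b))  # 8 of 9 words line up in each text: 88.9
```

Unlike the manual check in this project, SequenceMatcher takes word order into account, so the two approaches can give different scores for the same pair of files.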

Source Code :

# Import the required library
import tkinter as tk


def read_words(path):
    """Return the lowercase, purely alphanumeric words found in a text file."""
    with open(path) as f:
        # Tokens with punctuation attached (e.g. "word,") fail isalnum() and are dropped
        return [word for word in f.read().lower().split() if word.isalnum()]


# First we need to read the contents of the two files
l1 = read_words('File_1.txt')
l2 = read_words('File_2.txt')

# Finding how many distinct words are common to the two files
plag_words = len(set(l1).intersection(l2))

# Finding the total number of words in the two files
total_words = len(l1) + len(l2)

# Formula to calculate the plagiarism percent (0 when neither file has usable words)
if total_words:
    plag_percent = 100 - round((total_words - plag_words * 2) / total_words * 100)
else:
    plag_percent = 0

# Choosing the background colour from the plagiarism percent.
# NOTE: the branch for values of 30 or below was lost from the original listing;
# the "Green" colour used for it here is an assumption.
if plag_percent <= 30:
    bg_colour = "Green"
elif plag_percent <= 60:
    bg_colour = "Yellow"
else:
    bg_colour = "Red"

# Displaying the result
result = "      The Plagiarized Content Percent among two files is " + str(plag_percent) + "%"
win = tk.Tk()
win.geometry("800x200")
canvas = tk.Canvas(win, width=700, height=650, bg=bg_colour)
canvas.create_text(300, 100, text=result, fill="black", font=('Helvetica 15 bold'))
canvas.pack()
win.mainloop()
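
One thing to keep in mind about the listing above: isalnum() rejects any token that has punctuation attached, so a word like "files," is dropped completely instead of being counted as "files". If you would rather keep such words, one possible variation is sketched below. The tokenize helper is hypothetical and not part of the original project; it strips leading and trailing punctuation before the isalnum() test.

```python
import string


def tokenize(text):
    """Split text into lowercase words, trimming punctuation from both ends."""
    words = []
    for token in text.lower().split():
        # Remove leading/trailing punctuation such as commas and full stops
        cleaned = token.strip(string.punctuation)
        # Tokens with punctuation inside them (e.g. "it's") are still skipped
        if cleaned.isalnum():
            words.append(cleaned)
    return words


if __name__ == "__main__":
    print(tokenize("Hello, world! It's fine."))  # ['hello', 'world', 'fine']
```

With this variation more words survive the filter, so the plagiarism percent for the same two files may differ from the one the original listing reports.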
				
			

Code Explanation :

1. First we open the two files using the open() function.
2. We read the contents of each file using the read() method, which returns a string.
We then split that string into a list of words and keep only the purely alphanumeric ones, so any word with punctuation attached (such as a trailing comma) is excluded.
3. We extract the common words from the two lists by converting them to sets.
4. Next we calculate the number of common (plagiarised) words and the total number of words present in the two files.
5. Finally we apply a formula to calculate the plagiarised content percentage for the two files and display it on the screen.
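
To see the formula from steps 3 to 5 at work without the files or the GUI, here is a small sketch on two made-up sentences. The function name plagiarism_percent and the sample word lists are assumptions for illustration; the arithmetic is the same as in the source code above, with a guard added for empty input.

```python
def plagiarism_percent(words_1, words_2):
    """Apply the project's formula to two lists of words."""
    # Distinct words that appear in both lists
    common = len(set(words_1) & set(words_2))
    # Total number of words, repeats included
    total = len(words_1) + len(words_2)
    if total == 0:
        return 0  # nothing to compare
    return 100 - round((total - 2 * common) / total * 100)


if __name__ == "__main__":
    words_1 = "python is easy to learn".split()
    words_2 = "python is fun to use".split()
    # common = 3 (python, is, to), total = 10
    # 100 - round((10 - 6) / 10 * 100) = 100 - 40 = 60
    print(plagiarism_percent(words_1, words_2))  # 60
```

Note that common words are counted once each while the total counts every repeat, so two identical files that contain repeated words score below 100%. For example, comparing the words a, a, b with themselves gives 67.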

Output :

[Image: Detect plagiarism in files using Python]
