You have been tasked with building a URL file validator for a web crawler. A web crawler is an application that fetches a web page, extracts the URLs present in that page, and then recursively fetches new pages using the extracted URLs. The end goal of a web crawler is to collect text data, images, or other resources present in order to validate resource URLs or hyperlinks on a page. URL validators can be useful to validate if the extracted URL is a valid resource to fetch. In this scenario, you will build a URL validator that checks for supported protocols and file types.
What do you need to do?
1. Writing detailed comments and docstrings
2. Organizing and structuring code for readability
3. URL = <protocol>://<hostname>/<fileinfo>
Steps for Completion
Task
Create two lists of strings: one for protocols, called valid_protocols, and one for file extensions, called valid_fileinfo. For this task, the protocol list should be restricted to http, https and ftp. The file extension list should be .html, .csv and .docx.
Split an input named url, and then use the first element to check whether the protocol of the URL is in valid_protocols. Similarly, check whether the URL contains a valid file extension from valid_fileinfo.
Task
Write the conditions to return a Boolean value of True if the URL is valid, and False if either the Protocol or the File extension is not valid.
main.py
def validate_url(url):
    """Validates the given url passed as string.

    Arguments:
    url --- String, A valid url should be of form <Protocol>://<Hostname>/<Fileinfo>

    Protocol = [http, https, ftp]
    Hostname = string
    Fileinfo = [.html, .csv, .docx]
    """
    # your code starts here.



    return  # return True if url is valid else False


if __name__ == '__main__':
    url = input("Enter an Url: ")
    print(validate_url(url))

Answers

Answer 1

Answer:

Python Code:

def validate_url(url):
    # Lists of valid protocols and file name extensions
    valid_protocols = ['http', 'https', 'ftp']
    valid_fileinfo = ['.html', '.csv', '.docx']

    # Split the url into the protocol and the rest; a url without '://' is invalid
    url_split = url.split('://')
    if len(url_split) != 2:
        return False

    # The protocol must match exactly (a substring test would also accept 'xhttp')
    isProtocolValid = url_split[0] in valid_protocols

    # The part after '://' must end with one of the valid file extensions
    isFileValid = url_split[1].endswith(tuple(valid_fileinfo))

    # The URL is valid only if both the protocol and the file extension are valid
    return isProtocolValid and isFileValid


url = input("Enter an URL: ")
print(validate_url(url))

Explanation:

The image of the output code is attached. Hope it helps.
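As an alternative to splitting the string by hand, the same check could be sketched with Python's standard `urllib.parse` module. This is only a sketch under assumptions: the function name `validate_url_parsed` and the constants `VALID_PROTOCOLS` and `VALID_FILEINFO` are made up for this example, and the extra requirement that a hostname be present is not part of the original task.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical constants mirroring the lists from the task
VALID_PROTOCOLS = {"http", "https", "ftp"}
VALID_FILEINFO = {".html", ".csv", ".docx"}


def validate_url_parsed(url):
    """Return True if url has a supported protocol, a hostname, and a supported file extension."""
    parts = urlparse(url)
    # splitext returns (root, extension); only the path is inspected, so query strings are ignored
    extension = posixpath.splitext(parts.path)[1].lower()
    return (
        parts.scheme in VALID_PROTOCOLS
        and bool(parts.netloc)
        and extension in VALID_FILEINFO
    )


print(validate_url_parsed("http://example.com/index.html"))
print(validate_url_parsed("smtp://example.com/index.html"))
```

Because only the path component is checked, a URL such as `https://example.com/data.csv?page=2` is still accepted, while `xhttp://example.com/index.html` is rejected.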


Related Questions

Write a SELECT statement that displays each product purchased, the category of the product, the number of each product purchased, the maximum discount of each product, and the total price from all orders of the item for each product purchased. Include a final row that gives the same information over all products (a single row that provides cumulative information from all your rows). For the Maximum Discount and Order Total columns, round the values so that they only have two decimal spaces. Your result

Answers

Answer:

y6ou dont have mind

Explanation:

Suppose you are choosing between the following three algorithms:
• Algorithm A solves problems by dividing them into five subproblems of half the size, recursively solving each subproblem, and then combining the solutions in linear time.
• Algorithm B solves problems of size n by recursively solving two subproblems of size n − 1 and then combining the solutions in constant time.
• Algorithm C solves problems of size n by dividing them into nine sub-problems of size n/3, recursively solving each sub-problem, and then combining the solutions in O(n^2) time.
What are the running times of each of these algorithms (in big-O notation), and which would you choose?

Answers

Answer:

Algorithm C is chosen

Explanation:

For Algorithm A

T(n) = 5 * T(n/2) + O(n)

where: a = 5, b = 2, α = 1

attached below is the remaining part of the solution
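
Since the attachment is not reproduced here, the following is a sketch of how the three recurrences would usually be solved: the master theorem for A and C, and direct unrolling for B. The labels $T_A$, $T_B$ and $T_C$ are introduced only for this sketch.

```latex
\begin{align*}
T_A(n) &= 5\,T_A(n/2) + O(n),   & \log_2 5 \approx 2.32 > 1 &\;\Rightarrow\; T_A(n) = O\!\left(n^{\log_2 5}\right) \approx O\!\left(n^{2.32}\right) \\
T_B(n) &= 2\,T_B(n-1) + O(1),   &                            &\;\Rightarrow\; T_B(n) = O\!\left(2^{n}\right) \\
T_C(n) &= 9\,T_C(n/3) + O(n^2), & \log_3 9 = 2               &\;\Rightarrow\; T_C(n) = O\!\left(n^{2}\log n\right)
\end{align*}
```

$n^2 \log n$ grows more slowly than $n^{2.32}$, and both grow far more slowly than $2^n$, which agrees with choosing Algorithm C.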

How does it relate
to public domain
and fair use?

Answers

Because the corilateion of the hippo

How was the first computer reprogrammed

Answers

Answer:

the first programs were meticulously written in raw machine code, and everything was built up from there. The idea is called bootstrapping. ... Eventually, someone wrote the first simple assembler in machine code.

Explanation:

Discuss the relationship of culture and trends?

Answers

A good thesis would be something like “Modern culture is heavily influenced by mainstream trends” and just building on it.

Consider the following two relations for Millennium College

STUDENT(StudentID, StudentName,CampusAddress, GPA)
REGISTRATION(StudentID, CourseID, Grade)

Following is a typical query against these Relation:

SELECT Student_T.StudentID,StudentName,
CourseID, Grade
FROM Student_T, Registration_T
WHERE Student_T.StudentID =
Registration_T.StudentID
AND GPA> 3.0
ORDER BY StudentName;

Required:
On what attributes should indexes be defined to speed up this query? Please give reasons for each selected

Answers

Thryyrryyfyfuguhoojihugtdtddt

Create a recursive procedure named (accumulator oddsum next). The procedure will return the sum of the odd numbers entered from the keyboard. The procedure will read a sequence of numbers from the keyboard, where parameter oddsum will keep track of the sum from the odd numbers entered so far and parameter next will (read) the next number from the keyboard.

Answers

Answer:

Explanation:

The following procedure is written in Python. It takes the next argument, checks if it is an odd number and if so it adds it to oddsum. Then it asks the user for a new number from the keyboard and calls the accumulator procedure/function again using that number. If any even number is passed the function terminates and returns the value of oddsum.

def accumulator(next, oddsum = 0):

   if (next % 2) != 0:

       oddsum += next

       newNext = int(input("Enter new number: "))

       return accumulator(newNext, oddsum)

   else:

       return oddsum
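
Because the function above calls `input()` directly, it is awkward to test. A possible variant, sketched here under the assumption that the numbers arrive through any Python iterator, keeps the same behaviour of stopping at the first even number. The name `accumulate_odds` and the iterator parameter are hypothetical.

```python
def accumulate_odds(numbers, oddsum=0):
    """Recursively add odd numbers from an iterator until an even number or the end is reached."""
    nxt = next(numbers, None)  # None signals that the iterator is exhausted
    if nxt is None or nxt % 2 == 0:
        return oddsum
    return accumulate_odds(numbers, oddsum + nxt)


# 1 + 3 + 5 are added; the even number 2 stops the recursion before 7 is read
print(accumulate_odds(iter([1, 3, 5, 2, 7])))
```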

This lab was designed to teach you more about using Scanner to chop up Strings.

Lab Description: Take a group of numbers all on the same line and average the numbers. First, total up all of the numbers. Then, take the total and divide that by the number of numbers. Format the average to three decimal places.

Sample Data:
9 10 5 20
11 22 33 44 55 66 77
48 52 29 100 50 29
0
100 90 95 98 100 97

Files Needed: Average.java, AverageRunner.java

Sample Output:
9 10 5 20
average = 11.000
11 22 33 44 55 66 77
average = 44.000
48 52 29 100 50 29
average = 51.333
0
average = 0.000
100 90 95 98 100 97
average = 96.667

Answers

Answer:

The program is as follows:

import java.util.*;

public class Main {

    public static void main(String[] args) {
        Scanner input = new Scanner(System.in);
        String score;
        System.out.print("Scores: ");
        score = input.nextLine();
        String[] scores_string = score.split(" ");
        double total = 0;
        for (int i = 0; i < scores_string.length; i++) {
            total += Double.parseDouble(scores_string[i]);
        }
        double average = total / scores_string.length;
        System.out.format("Average: %.3f", average);
    }
}

Explanation:

This declares score as string

 String score;

This prompts the user for scores

 System.out.print("Scores: ");

This gets the input from the user

 score = input.nextLine();

This splits the scores into an array

 String[] scores_string = score.split(" ");

This initializes total to 0

 double total = 0;

This iterates through the scores

 for(int i = 0; i<scores_string.length;i++){

This calculates the sum by first converting each entry to double

     total+= Double.parseDouble(scores_string[i]);  }

The average is calculated here

 double average = total/scores_string.length;

This prints the average

 System.out.format("Average: %.3f", average); }

}
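
For comparison, the same split, total and divide steps could be sketched in Python as below. The function names `average_line` and `format_average` are made up for this example, and returning 0.0 for an empty line is an assumption, not something the lab specifies.

```python
def average_line(line):
    """Average all whitespace-separated numbers on one line."""
    values = [float(token) for token in line.split()]
    if not values:
        return 0.0  # assumption: an empty line averages to zero
    return sum(values) / len(values)


def format_average(line):
    """Format the average to three decimal places, as in the lab's sample output."""
    return f"average = {average_line(line):.3f}"


print(format_average("9 10 5 20"))
print(format_average("48 52 29 100 50 29"))
```

Calling `split()` with no argument also tolerates repeated spaces between numbers, which a split on a single space does not.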

A non-profit organization decides to use an accounting software solution designed for non-profits. The solution is hosted on a commercial provider's site, but the accounting information, such as the general ledger, is stored on the non-profit organization's network. Access to the software application is through an interface that uses a conventional web browser. The solution is being used by many other non-profits. Which security structure is likely to be in place:

Answers

Answer:

A firewall protecting the  software at the provider

Explanation:

The security structure that is likely to be in place is a firewall protecting the software at the provider.

Since access to the software application is via a conventional web browser, firewalls will be used to protect against unauthorized Internet users gaining access to the private networks connected to the Internet.

Earl develops a virus tester which is very good. It detects and repairs all known viruses. He makes the software and its source code available on the web for free and he also publishes an article on it. Jake reads the article and downloads a copy. He figures out how it works, downloads the source code and makes several changes to enhance it. After this, Jake sends Earl a copy of the modified software together with an explanation. Jake then puts a copy on the web, explains what he has done and gives appropriate credit to Earl.
Discuss whether or not you think Earl or Jake has done anything wrong?

Answers

Answer:

They have done nothing wrong because Jake clearly gives credit to Earl for the work that he did

Explanation:

Earl and Jake have not done anything wrong, because Jake gives proper credit to Earl.

What is a virus tester?

A virus tester is software that scans a computer for viruses. If a site tries to attack the computer system, the software detects the attack, stops it, and protects the computer from the virus.

Thus, Earl and Jake have not done anything wrong, because Jake gives proper credit to Earl.


Write a program that will accept three values for the sides of a triangle from the user and determine whether the values are for an isosceles, equilateral, or scalene triangle.

Answers

Answer:

try asking the question by sending a picture rather than typing
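
Reading the question as a request to classify a triangle from three side lengths (an assumption, since the question text is garbled), a minimal sketch might look like this. The function name `classify_triangle` and the extra "not a triangle" result are hypothetical additions.

```python
def classify_triangle(a, b, c):
    """Classify three side lengths as equilateral, isosceles, or scalene."""
    shortest, middle, longest = sorted((a, b, c))
    # Sides must be positive and satisfy the triangle inequality
    if shortest <= 0 or shortest + middle <= longest:
        return "not a triangle"
    if a == b == c:
        return "equilateral"
    if a == b or b == c or a == c:
        return "isosceles"
    return "scalene"


print(classify_triangle(3, 4, 5))
```

In an interactive program the three values would come from `input()` and be converted with `float()` before calling the function.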

Before hard disk drives were available, computer information was stored on:

Floppy Disks

Cassette Tapes

Punch Cards

All of the Above

Answers

floppy disks . is the answer

To use cache memory main memory are divided into cache lines typically 32 or 64 bytes long.an entire cache line is cached at once what is the advantage of caching an entire line instead of single byte or word at a time?

Answers

Answer:

The advantage of caching an entire line instead of a single byte or word at a time is that it takes advantage of spatial locality.

Explanation:

Programs tend to access memory addresses that are close to ones they accessed recently (the principle of spatial locality). Fetching a whole line means the neighbouring bytes are already in the cache when they are needed, and the fixed cost of a main memory access is spread over 32 or 64 bytes instead of a single word.

Which of the following transfer rates is the FASTEST?
1,282 Kbps
1,480 Mbps
1.24 Gbps
181 Mbps

Answers

Answer:

The correct answer is 1,480 Mbps.

Explanation:

For this you have to know how to convert between the different units of bits per second.

1,000 bits (b) = 1 kilobit (Kb)

1,000 kilobits (Kb) = 1 megabit (Mb)

1,000 megabits (Mb) = 1 gigabit (Gb)

and finally, 1,000 gigabits (Gb) = 1 terabit (Tb)

Answer:

1,480 Mbps

Explanation:

1,480 Mbps is equal to 1.48 Gbps, and so it is the fastest transfer rate listed because it is greater than 1.24 Gbps. In summary, 1,480 Mbps > 1.24 Gbps > 181 Mbps > 1,282 Kbps.
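
The comparison can be checked by converting every rate to bits per second first. The following is a small sketch; the names `UNIT_FACTORS` and `to_bps` are made up for this example, and decimal (powers of 1,000) prefixes are assumed.

```python
# Multipliers for converting each unit to bits per second (decimal prefixes assumed)
UNIT_FACTORS = {"Kbps": 1_000, "Mbps": 1_000_000, "Gbps": 1_000_000_000}


def to_bps(value, unit):
    """Convert a transfer rate to bits per second."""
    return value * UNIT_FACTORS[unit]


rates = [(1282, "Kbps"), (1480, "Mbps"), (1.24, "Gbps"), (181, "Mbps")]

# Sort from fastest to slowest by the converted value
ranked = sorted(rates, key=lambda rate: to_bps(*rate), reverse=True)
fastest = ranked[0]
print(fastest)
```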

Which of the following describes a codec? Choose all that apply.
a computer program that saves a digital audio file as a specific audio file format
short for coder-decoder
converts audio files, but does not compress them

Answers

Answer:

A, B

Explanation:

An example of a host-based intrusion detection tool is the tripwire program. This is a file integrity checking tool that scans files and directories on the system on a regular basis and notifies the administrator of any changes. It uses a protected database of cryptographic checksums for each file checked and compares this value with that recomputed on each file as it is scanned. It must be configured with a list of files and directories to check and what changes, if any, are permissible to each. It can allow, for example, log files to have new entries appended, but not for existing entries to be changed. What are the advantages and disadvantages of using such a tool? Consider the problem of determining which files should only change rarely, which files may change more often and how, and which change frequently and hence cannot be checked. Hence consider the amount of work in both the configuration of the program and on the system administrator monitoring the responses generated.

Answers

Answer:

The main problem with such a tool would be resource usage

Explanation:

The main problem with such a tool would be resource usage. Such a tool needs a significant amount of CPU time and disk I/O to recompute the checksum of every checked file and to finish each scan before the next cycle starts. Storage needs are smaller, because the tool keeps only a checksum per file rather than a copy of the original, but that checksum database must itself be protected from tampering. Depending on the number of files in the system, configuring the program may be very extensive, since each file or directory needs to be analyzed to decide whether it should be checked and which changes are permissible. Monitoring responses may not be so time consuming, since the program should only warn about changes that have occurred, which may be only 10% of the files on a daily basis or less.
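
The checksum comparison described in the question can be sketched in a few lines of Python. This is not how Tripwire itself is implemented; it is a minimal illustration that assumes SHA-256 as the checksum and an in-memory dictionary as the "database", and the function names `file_checksum`, `build_baseline` and `find_changes` are hypothetical.

```python
import hashlib
from pathlib import Path


def file_checksum(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_baseline(paths):
    """Record a checksum for every file that should be monitored."""
    return {str(path): file_checksum(path) for path in paths}


def find_changes(baseline):
    """Return the monitored paths that are missing or whose checksum differs."""
    changed = []
    for path, expected in baseline.items():
        if not Path(path).exists() or file_checksum(path) != expected:
            changed.append(path)
    return changed
```

Every scan rereads each monitored file in full, which is where the CPU and I/O cost comes from. A real tool would also need per-file rules, for example allowing log files to grow, which this sketch does not attempt.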

SOMEONE PLS HELP?!?!!

Answers

Answer:

B. To continuously check the state of a condition.

Explanation:

The purpose of an infinite loop with an "if" statement is to constantly check if that condition is true or false. Once it meets the conditions of the "if" statement, the "if" statement will execute whatever code is inside of it.

Example:

//This pseudocode will print "i is even!" every time i is an even number

int i = 0;

while (1 != 0) {     //always evaluates to true, meaning it loops forever

  i = i + 1;               // i gets incrementally bigger with each loop

  if (i % 2 == 0) {    //if i is even....

    say ("i is even!"); //print this statement

  }

}

Which of the following techniques is a direct benefit of using Design Patterns? Please choose all that apply.
Design patterns help you write code faster by providing a clear idea of how to implement the design.
Design patterns encourage more readable and maintainable code by following well-understood solutions.
Design patterns provide a common language / vocabulary for programmers.
Solutions using design patterns are easier to test.

Answers

Answer:

Design patterns help you write code faster by providing a clear idea of how to implement the design

Explanation:

Design patterns help you write code faster by providing a clear idea of how to implement the design. These are patterns that have already been implemented by many development teams and have been tested as efficient solutions to problems that tend to appear often. Using them allows you to focus on writing the code instead of spending time thinking about the problem and developing a solution. You simply follow the existing design pattern and write the code for the problem that the pattern solves.

Animation timing does not affect the speed of the presentation
Select one:
True
False​

Answers

Answer:

True

Explanation:

I think it's true.....

4.5.2 For loop: printing a dictionary python

Answers

Answer:

for x, y in thisdict.items():

 print(x, y)

Explanation:
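
The loop above assumes a dictionary named `thisdict` already exists. A self-contained sketch, with made-up sample data and a hypothetical helper `format_items`, might look like this:

```python
# Hypothetical sample data; any dictionary works the same way
thisdict = {"brand": "Ford", "model": "Mustang", "year": 1964}


def format_items(d):
    """Build one 'key value' string per dictionary entry, in insertion order."""
    return [f"{key} {value}" for key, value in d.items()]


# items() yields (key, value) pairs, which the for loop unpacks into two names
for key, value in thisdict.items():
    print(key, value)
```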

In java I need help on this specific code for this lab.


Problem 1:


Create a Class named Array2D with two instance methods:


public double[] rowAvg(int[][] array)


This method will receive a 2D array of integers, and will return a 1D array of doubles containing the average per row of the 2D array argument.


The method will adjust automatically to different sizes of rectangular 2D arrays.


Example: Using the following array for testing:


int [][] testArray =

{

{ 1, 2, 3, 4, 6},

{ 6, 7, 8, 9, 11},

{11, 12, 13, 14, 16}

};

must yield the following results:


Averages per row 1 : 3.20

Averages per row 2 : 8.20

Averages per row 3 : 13.20

While using this other array:


double[][] testArray =

{

{1, 2},

{4, 5},

{7, 8},

{3, 4}

};


must yield the following results:


Averages per row 1 : 1.50

Averages per row 2 : 4.50

Averages per row 3 : 7.50

Averages per row 4 : 3.50

public double[] colAvg(int[][] array)


This method will receive a 2D array of integers, and will return a 1D array of doubles containing the average per column of the 2D array argument.


The method will adjust automatically to different sizes of rectangular 2D arrays.


Example: Using the following array for testing:


int [][] testArray =

{

{ 1, 2, 3, 4, 6},

{ 6, 7, 8, 9, 11},

{11, 12, 13, 14, 16}

};

must yield the following results:


Averages per column 1: 6.00

Averages per column 2: 7.00

Averages per column 3: 8.00

Averages per column 4: 9.00

Averages per column 5: 11.00

While using this other array:


double[][] testArray =

{

{1, 2},

{4, 5},

{7, 8},

{3, 4}

};


must yield the following results:


Averages per column 1: 3.75

Averages per column 2: 4.75



My code is:


public class ArrayDemo2dd

{


public static void main(String[] args)

{

int [][] testArray1 =

{

{1, 2, 3, 4, 6},

{6, 7, 8, 9, 11},

{11, 12, 13, 14, 16}

};

int[][] testArray2 =

{

{1, 2 },

{4, 5},

{7, 8},

{3,4}

};


// The outer loop drives through the array row by row

// testArray1.length has the number of rows or the array

for (int row =0; row < testArray1.length; row++)

{

double sum =0;

// The inner loop uses the same row, then traverses all the columns of that row.

// testArray1[row].length has the number of columns of each row.

for(int col =0 ; col < testArray1[row].length; col++)

{

// An accumulator adds all the elements of each row

sum = sum + testArray1[row][col];

}

//The average per row is calculated dividing the total by the number of columns

System.out.println(sum/testArray1[row].length);

}

} // end of main()


}// end of class


However, it says there's an error... I'm not sure how to exactly do this type of code... So from my understanding do we convert it?

Answers

Answer:

Explanation:

The following code is written in Java and creates the two methods as requested each returning the desired double[] array with the averages. Both have been tested as seen in the example below and output the correct output as per the example in the question. they simply need to be added to whatever code you want.

public static double[] rowAvg(int[][] array) {

       double result[] = new double[array.length];

     for (int x = 0; x < array.length; x++) {

         double average = 0;

         for (int y = 0; y < array[x].length; y++) {

             average += array[x][y];

         }

         average = average / array[x].length;

         result[x] = average;

     }

     return result;

   }

   public static double[] colAvg(int[][] array) {

       double result[] = new double[array[0].length];

       for (int x = 0; x < array[0].length; x++) {

           double average = 0;

           for (int y = 0; y < array.length; y++) {

               average += array[y][x];

           }

           average = average / array.length;

           result[x] = average;

       }

 

       return result;

   }
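
As a cross-check of the expected results in the question, the same row and column averages can be sketched in Python. The function names `row_avg` and `col_avg` simply mirror the Java ones, and a rectangular, non-empty array is assumed.

```python
def row_avg(array):
    """Average of each row of a rectangular 2D list."""
    return [sum(row) / len(row) for row in array]


def col_avg(array):
    """Average of each column; zip(*array) regroups the values by column."""
    return [sum(column) / len(array) for column in zip(*array)]


test_array = [
    [1, 2, 3, 4, 6],
    [6, 7, 8, 9, 11],
    [11, 12, 13, 14, 16],
]

for number, average in enumerate(row_avg(test_array), start=1):
    print(f"Averages per row {number} : {average:.2f}")
for number, average in enumerate(col_avg(test_array), start=1):
    print(f"Averages per column {number}: {average:.2f}")
```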

Create a list of 30 words, phrases and company names commonly found in phishing messages. Assign a point value to each based on your estimate of its likeliness to be in a phishing message (e.g., one point if it’s somewhat likely, two points if moderately likely, or three points if highly likely). Write a program that scans a file of text for these terms and phrases. For each occurrence of a keyword or phrase within the text file, add the assigned point value to the total points for that word or phrase. For each keyword or phrase found, output one line with the word or phrase, the number of occurrences and the point total. Then show the point total for the entire message. Does your program assign a high point total to some actual phishing e-mails you’ve received? Does it assign a high point total to some legitimate e-mails you’ve received?

Answers

Answer:

Words found in phishing messages are:

Free gift

Promotion

Urgent

Congratulations

Check

Money order

Social security number

Passwords

Investment portfolio

Giveaway

Get out of debt

Etc. This should be a good starting point for figuring out a full list.
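
The scanning program that the question asks for could be sketched as follows, using the terms listed above. The point values are assumptions chosen only for illustration, the names `PHISHING_TERMS`, `score_text` and `total_score` are hypothetical, and the text is passed in as a string rather than read from a file.

```python
import re

# Terms from the list above; the point values (1 to 3) are illustrative assumptions
PHISHING_TERMS = {
    "free gift": 2, "promotion": 1, "urgent": 3, "congratulations": 2,
    "check": 1, "money order": 2, "social security number": 3,
    "passwords": 3, "investment portfolio": 1, "giveaway": 2,
    "get out of debt": 2,
}


def score_text(text):
    """Map each term found to (occurrences, points), matching whole words case-insensitively."""
    lowered = text.lower()
    report = {}
    for term, points in PHISHING_TERMS.items():
        count = len(re.findall(r"\b" + re.escape(term) + r"\b", lowered))
        if count:
            report[term] = (count, count * points)
    return report


def total_score(report):
    """Sum the points over all terms found."""
    return sum(points for _, points in report.values())


sample = "URGENT: congratulations! Claim your free gift now. This is urgent."
report = score_text(sample)
for term, (count, points) in report.items():
    print(term, count, points)
print("Total:", total_score(report))
```

To scan a file, its contents can be read into a string first and passed to `score_text`.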

Complete the problem about Olivia, the social worker, in this problem set. Then determine the telecommunications tool that would best meet Olivia's needs.


PDA

VoIP

facsimile

Internet

Answers

Answer:

PDA is the correct answer to the question.

Explanation:

PDA stands for Personal Digital Assistant, which is a portable organizer that stores contact data, manages calendars, communicates via e-mail, and manages documents and spreadsheets, typically in conjunction with the user's personal computer. Olivia needs a PDA in order to communicate more effectively.

Write a recursive method called permut that accepts two integers n and r as parameters and returns the number of unique permutations of r items from a group of n items. For given values of n and r, this value P(n, r) can be computed as follows:
n!/(n - r)!
For example , permut (7, 4) should return 840.

Answers

Answer:

Following are the code to the given question:

public class Main//defining a class Main

{

static int permut(int n, int r)//defining a method permut that holds two variable

{

return fact(n)/fact(n-r);//use return keyword to return calcuate value

}

static int fact(int n)//defining a fact method as recursive to calculate factorials

{

return n==0?1:n*fact(n-1);//calling the method recursively

}

public static void main(String[] abs)//main function

{

//int n=7,r=4;//defining integer variable

System.out.println(permut(7,4));//use print method to call permut method and print its values

}

}

Output:

840

Explanation:

Following is the explanation for the above code.

Defining a class Main.Inside the class two methods, "permut and fact" were defined, in which the "permut" accepts two values for calculating its permutated value, and the fact is used for calculates factorial values recursively. At the last, the main method is declared, which uses the print method to call "permut" and prints its return values.
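
In the code above only `fact` is recursive, and with `int` arithmetic the factorial overflows from 13! onward. A version in which `permut` itself is recursive can be sketched from the identity P(n, r) = n · P(n − 1, r − 1) with P(n, 0) = 1. This sketch is in Python and assumes 0 ≤ r ≤ n.

```python
def permut(n, r):
    """Number of permutations of r items chosen from n, computed recursively."""
    if r == 0:
        return 1  # there is exactly one way to arrange zero items
    return n * permut(n - 1, r - 1)


print(permut(7, 4))  # 7 * 6 * 5 * 4 = 840
```

This avoids computing the full factorials, so the intermediate values never grow larger than the result.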

Joseph learned in his physics class that a centimeter is a smaller unit of length and that a hundred centimeters group to form a larger unit of length called a meter. Joseph recollected that in computer science, a bit is the smallest unit of data storage and a group of eight bits forms a larger unit. Which term refers to a group of eight binary digits? A. bit B. byte C. kilobyte D. megabyte

Answers

Answer:

byte

Explanation:

A byte is made up of eight binary digits

what materials can I find at home and make a cell phone tower​

Answers

Answer:

you cant

Explanation:

You simply cant make a tower from materials found in a household

Which of the following are examples of meta-reasoning?
A. She has been gone long so she must have gone far.
B. Since I usually make the wrong decision and the last two decisions I made were correct, I will reverse my next decision.
C. I am getting tired so I am probably not thinking clearly.
D. I am getting tired so I will probably take a nap.

Answers

Answer: B. Since I usually make the wrong decision and the last two decisions I made were correct, I will reverse my next decision.

C. I am getting tired so I am probably not thinking clearly

Explanation:

Meta-reasoning refers to the processes that monitor the progress of our reasoning and problem-solving activities and regulate the time and effort devoted to them.

From the options given, the examples of meta reasoning are:

B. Since I usually make the wrong decision and the last two decisions I made were correct, I will reverse my next decision.

C. I am getting tired so I am probably not thinking clearly

What is the difference between an information system and a computer application?

Answers

Answer:

An information system is a set of interrelated computer components that collects, processes, stores and provides output of the information for business purposes

A computer application is a computer software program that executes on a computer device to carry out a specific function or set of related functions.

Write the Java classes for the following classes. Make sure each includes 1. instance variables 2. constructor 3. copy constructor. Submit a PDF with the code. The classes are as follows:

Character
has-a name : String
has-a health : integer
has-a (Many) weapons : ArrayList
has-a parent : Character
has-a level : Level (this should not be a deep copy)

Level
has-a name : String
has-a levelNumber : int
has-a previousLevel : Level
has-a nextLevel : Level

Weapon
has-a name : String
has-a strength : double

Monster is-a Character
has-a angryness : double
has-a weakness : Weakness

Weakness
has-a name : String
has-a description : String
has-a amount : int

Answers

Answer:

Explanation:

The following code is written in Java. It is attached as a PDF as requested and contains all of the classes, with the instance variables as described. Each variable has a setter/getter method and each class has a constructor. The image attached is a glimpse of the code.

To add musical notes or change the volume of sound, use blocks from.

i. Control ii. Sound iii. Pen​

Answers

Sorry but what’s the question?
ii-i-iii I got you.