Miscellaneous web tools
Robots.txt Generator
The robots.txt file is the first thing search engine "spiders" look for when they visit a website. It tells search engine spiders (robots) which files and/or directories they are NOT allowed to index. This helps prevent incomplete site indexing and keeps the files and directories that you don't want "all over the world wide web" out of search results. You may, for example, disallow "Google Images" from indexing certain directories and pages on your site without blocking "Google" itself from indexing those same files.

INSTRUCTIONS (please read carefully):

1. Select a user agent (robot) in the left column (the default is "ALL User Agents"), then click the "ADD USER AGENT" button.
2. Type in a directory and/or web page that you do not want indexed, then click the "ADD" button. Make sure there is a forward slash ("/") between directories and pages, and at the end of directories (see the three examples below). Repeat this step until you have added all of the files you want under the user agent. If you don't want a user agent to index ANY of your site, LEAVE THE TEXTFIELD BLANK and click the "ADD" button; a "/" will appear after "Disallow:" in the generated code.
3. Once step 2 is complete, repeat steps 1 and 2 until you have added all of the user agents you want, and the files under each that you want to "Disallow".
4. When complete, click "COPY CODE" to save the robots.txt code.
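The steps above can be sketched in a few lines of Python. This is only an illustration of the kind of output such a generator produces, not the tool's actual code: the function name `build_robots_txt`, the dictionary layout, and the choice to prefix every path with "/" are assumptions, and "Googlebot-Image" is used as the user-agent name for the "Google Images" robot.

```python
def build_robots_txt(rules):
    """Build robots.txt text from a {user_agent: [paths]} mapping.

    Hypothetical helper: an empty path list is treated as "disallow the
    whole site" (Disallow: /), mirroring the blank-textfield case in step 2.
    """
    blocks = []
    for user_agent, paths in rules.items():
        lines = [f"User-agent: {user_agent}"]
        if not paths:
            lines.append("Disallow: /")
        for path in paths:
            # Assumption: paths are written relative to the site root,
            # so each one is emitted with a single leading slash.
            lines.append("Disallow: /" + path.lstrip("/"))
        blocks.append("\n".join(lines))
    # A blank line separates the groups for different user agents.
    return "\n\n".join(blocks) + "\n"


# The three example entries from this page under "ALL User Agents" (*),
# plus one robot that is blocked from the entire site.
example_rules = {
    "*": ["folderName/myPage.html", "folderName/", "myPage.html"],
    "Googlebot-Image": [],
}
robots_txt = build_robots_txt(example_rules)
print(robots_txt)
```

Each user agent gets its own group of "Disallow:" lines, which is why steps 1 and 2 are repeated once per robot.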
Select a User Agent | Add directories/files you DO NOT want indexed (format: Directory/Page.html)

Three examples:
1: folderName/myPage.html
2: folderName/
3: myPage.html
Copy and paste the code into a text file named "robots.txt", then upload it to your site's root directory, where your "index" page is.
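Before or after uploading, the rules can be checked with Python's standard `urllib.robotparser` module. The sketch below parses the rules from a string rather than fetching them from a live site. The domain `www.example.com`, the robot name "ExampleBot", and the sample rules are placeholders chosen for illustration.

```python
from urllib.robotparser import RobotFileParser

# Sample rules, as they might appear in the saved robots.txt file.
robots_lines = """\
User-agent: *
Disallow: /folderName/
Disallow: /myPage.html
""".splitlines()

parser = RobotFileParser()
parser.parse(robots_lines)

base = "https://www.example.com"  # placeholder domain
# "ExampleBot" is a made-up robot name; it falls under the "*" rules.
in_blocked_dir = parser.can_fetch("ExampleBot", base + "/folderName/myPage.html")
blocked_page = parser.can_fetch("ExampleBot", base + "/myPage.html")
home_page = parser.can_fetch("ExampleBot", base + "/index.html")

print(in_blocked_dir, blocked_page, home_page)
```

`can_fetch` returns `False` for a URL that falls under a "Disallow:" line and `True` otherwise, so this gives a quick check that the directories and pages you entered are covered.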