When a search engine visits your website, one of the first things it does is check whether a robots.txt file exists. This file is designed to direct the crawler by identifying the files and folders on your site that it may crawl and those it may not. In that sense robots.txt serves the opposite purpose of a sitemap, which points the crawler towards all of your pages rather than away from some of them. Use this tool to create your robots.txt file so that you can keep certain pages out of the index and/or define different behaviour for each crawler that visits your site.
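For reference, a minimal robots.txt might look something like this (the folder names and sitemap URL are placeholders; substitute your own):

```
# Allow every crawler, but keep two private folders off limits
User-agent: *
Disallow: /admin/
Disallow: /tmp/

# Optionally point crawlers at your sitemap
Sitemap: https://www.example.com/sitemap.xml
```

The file lives at the root of your domain (e.g. example.com/robots.txt), and an empty Disallow line would mean nothing is blocked.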
This tool is simple to use. By default it specifies that every spider from each source is allowed to index your pages. Depending on the situation, however, you can block certain crawlers from seeing certain pages. For example, beyond the search engines themselves, you can block various backlink-analysis and other data-harvesting tools such as SEOprofiler, Majestic, SEOmoz, Ahrefs, SEMrush, WebMeUp and so on.
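As a sketch, blocking a few of those tools while leaving everyone else unrestricted could look like this. The user-agent tokens shown (AhrefsBot, SemrushBot, MJ12bot for Majestic, BLEXBot for WebMeUp) are the commonly published ones, but check each service's own documentation for the current token:

```
# Turn away specific SEO data-harvesting crawlers entirely
User-agent: AhrefsBot
Disallow: /

User-agent: SemrushBot
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: BLEXBot
Disallow: /

# Everyone else may crawl the whole site
User-agent: *
Disallow:
```

Note that robots.txt is advisory: well-behaved crawlers honour it, but it is not an access-control mechanism.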
Having a robots.txt file in place is a good idea for any site, as it gives you control over how crawlers treat your content and the ability to restrict access wherever needed.