
How to Set Up Cloudflare Crawler Hints


This guide shows how to set up Cloudflare Crawler Hints properly for your website. Cloudflare has long helped website owners protect and speed up their web projects, but this time it goes further and coordinates directly with search engines to make a better internet by working with web crawlers. This matters because, put simply, 60 to 80% of web crawler traffic is useless to both websites and search engines, so resources are wasted at both ends. With Crawler Hints, Cloudflare is helping to make the internet greener and more energy-efficient. More details about Crawler Hints follow below.

Will Cloudflare Crawler Hints affect page rank?

Many people ask whether enabling Cloudflare Crawler Hints will hurt their page rank or reduce their traffic. The answer is no. Crawler Hints is a safe mechanism defined by Cloudflare that cuts down on wasteful crawling while still letting good bots crawl your site whenever it needs them. For example, if a page has had no content update and crawlers keep requesting it, that is wasteful at both ends. When the content is updated, Crawler Hints signals the major search engines so that they can start crawling. That is all there is to it. The Cloudflare talk further below covers Crawler Hints in more depth.

Enable Crawler Hints for my website

  1. Sign in to your Cloudflare Account.
  2. In the dashboard, navigate to the Cache tab.
  3. Click on the Configuration section.
  4. Locate the Crawler Hints sign-up card and enable it.

How Cloudflare Is Reducing the Environmental Impact of Web Searches

The following talk is from Cloudflare’s Impact Week, an in-depth discussion about Crawler Hints.

We’re here today to talk to you about what we’re doing to reduce the environmental impact of search engine crawling. Before we dive in on that topic, I want to talk about Impact Week a little bit. Matthew, this is our first Impact Week. Where did the idea come from? What’s the goal for the week? And what would you like folks to take away from the week when all is said and done?

So, you know, I think that we’ve done these Innovation Weeks, and they were born out of the very first year after we launched. We decided, right around the anniversary of our launch date, which is the end of September, September 27, that we would announce a new product that we wouldn’t make any money off of, but that we thought was a good thing for the internet. That first product was the IPv6 gateway that allowed IPv6 support across our network. It was great for the team. It felt like it was the right thing to do for the internet, and it felt like something that really resonated across the market. Getting more IPv6 support is just a per se good thing, globally. So that was something that we really wanted to work on. And I think that inspired us over the years, every year on our birthday, to try and do something that gave back to the internet, that wasn’t a product we charged for but was something that we thought was per se good. I think the biggest one, one of the smartest things that we did at the company and one of the things I’m proudest of, was that in 2014 we made TLS, so HTTPS, free for all of our customers. It was super scary at the time because it used to be the one thing that was the difference between our free plan and our paid plan: one supported encryption and the other didn’t.
But we looked out and said, you know, we want to be on the right side of history here. Obviously, a better internet is an encrypted internet. And so if our mission is to help build a better internet, then of course we should do that. It was a huge technical challenge and a huge lift. But that moment when we pushed the button and encrypted a huge portion of the web, literally doubling the size of the encrypted web in 24 hours, is just one of my fondest memories at the company. It was such a great motivating thing that we are always looking for how we could do other weeks that would drive us to take innovative, interesting risks. I think a lot of companies use their user conference or whatever as a deadline: stuff has to get done by then because we’re going to present it at the conference. We’ve deconstructed that to some extent and said, how can we use these weeks over the course of the year to push the launching of various features? So we did a Security Week, we’ve done a Developer Week, and we’ve got some more coming up. But at some point Alyssa, who runs our public policy team, said that we should really feature some of the good that we’re doing and package it up, because a lot of both our customers and our investors are interested in what’s known as ESG. So enterprise, social, good, or, excuse me, environmental, social, and governance. As we started talking about it, we thought, well, maybe this can be a week, but Innovation Weeks at Cloudflare inherently mean launching new products.
And so we looked out at a number of the things that we were doing and asked ourselves, what are some of the projects that are just per se good, across the US, and how can that come together? And like all of these, it’s often a race, right? At the last minute Alex was, you know, dealing with that. About 10 days before, there’s always a moment of sheer panic. But it comes together well, and tomorrow will be the last day of Impact Week. It’s a list of things that I think our whole team is incredibly proud to have worked on.

I think just speaking personally as a citizen of the planet, and then speaking professionally as a product manager at Cloudflare, one of the cool intersections here is that we could have just said, oh, we’re gonna buy some carbon offsets, we’re gonna do the sort of checklist-y things that big companies go out and do. But we sat down and thought about what products we can build. How can we be innovative to solve real problems here?

And I think, you know, we’re not immune to this, but sometimes when companies do things, they don’t feel core to what it is that they’re doing. What’s been good about Impact Week is that it very much isn’t just that we decided to do these check-the-box things. We said, how can we use our technology, our network, and our people in order to make the internet better? And that’s exactly where Crawler Hints came from.

Cool, yeah, let’s switch gears and talk about Crawler Hints for a second. I know, since that first birthday for Cloudflare, when we came into the world, interacting with crawlers and helping customers manage their interactions with crawlers has been very close to the top of the jobs we have to do. What were the initial products focused on helping customers manage crawler interactions, and have those matured or changed over time?

Yeah, I mean, at some level Cloudflare was inspired by the problems that crawlers create. Lee Holloway, who is one of the three co-founders of Cloudflare, and I had worked on something called Project Honey Pot, which is basically an open-source project that would track bad actors online, still around at approximately And I mean, it’s a testament to what a bad coder I am.
If you surf on those pages, they’re all heavily hyperlinked and interlinked, but they’re all very database-driven, and there was no caching layer. What would happen is that a Google crawler would come along and find all these links and be like, wow, this is amazing content, and crawl it heavily, and our back-end server would crash all the time. And so, almost as much as understanding that there are all these malicious crawlers out there that were harvesting email addresses or searching for vulnerabilities, which Project Honey Pot was specifically studying, we also saw that there were good crawlers that would impose a substantial load on the internet. What’s actually interesting is that, as a percentage of traffic, search crawlers have decreased over that period of time. That’s not because the total volume has gone down. It’s because there’s just been so much more malicious crawling, and so much more internet usage generally. But it’s still about 5% of all internet traffic globally, and it puts a lot of load, especially on smaller customers. So early on we were always asking: if a page is static, if it doesn’t have to be database-driven, if it doesn’t have to be driven from a WordPress instance or whatever, how can we make that available and make sure that it’s incredibly fast? And especially as Google and other search engines started to use performance as one of the key metrics for ranking, that really helped us focus on how to make performance as fast as possible whenever Google is crawling the web.
And so I think we’ve always had in the back of our mind that helping our customers work with crawlers, both to decrease their load and to help them have better SEO, search engine optimization, were some of the values that we probably don’t talk about a ton at Cloudflare, but that were very much part of the value proposition from our earliest days.

Yeah, no, that makes a lot of sense. Alex, for those of our viewers who haven’t read the blog or aren’t familiar, how does Crawler Hints, the thing we announced, do the things that Matthew talks about: help crawlers, help our customers, and help the internet in general?

Yeah. So I think that’s a really good question. Going back to Matthew’s point, a lot of the automated traffic that exists on the web comes from what I’d call good bots, which I think is an important distinction to draw here: automated traffic that performs some sort of useful function. Talking about crawlers, that could be going to your web page and indexing it so that you show up in a search engine, indexed in a way that’s useful so that your website can be found by people who use that search engine to find specific types of content. Or it could be social media aggregators, pricing bots, that sort of bot traffic. What we were interested in with all this traffic was the efficiency of it: how frequently do they go to a web page, and do they find something new that they haven’t seen already, that could better inform the purpose of the crawling? So if you’re a search engine crawler, you go to a website and index it, and then you come back a little bit later. If that website hasn’t changed, you already have that content, and there’s no reason to keep coming back over and over again. So we were really focused on trying to answer the question of whether these search engines were doing redundant crawls, the same thing that Matthew had seen: with these over-and-over-again crawls, were they used efficiently or not? Diving into the data, we found that around 50% of the crawls didn’t really find anything new. They were just hitting the web pages behind Cloudflare randomly or naively, or because they thought they might find something new, which is not really a point of blame or accusation against these crawlers.
They’re trying to find the newest content, the freshest content, so that they can accomplish their task of being useful to the user. But we thought that we could work with some of these large bot developers or search engine providers and give them an additional heuristic: we could look to see when content on our edge has changed and provide that information, so that going back to check whether anything has changed could be diminished. And they could add our hint, our crawler hint, as an additional data point to help inform the cadence of crawls to these websites.

Cool. So just to explain it simplistically: for websites that use Cloudflare and opt into this, we will help transition crawling from a poll-based, or polling-based, operation to one that happens on a push basis, right? When we know something has changed, we will let the major crawlers know. And they can be much more efficient, almost stingy, in how they spend their crawls. Is that accurate?

Yeah, that’s accurate. It’s providing them additional information. In the past, they may have known only that changes to this sort of page had been seen at a certain cadence, which may not be representative of when things are actually changing now or in the future. So giving them an additional piece of data so that they can make slightly smarter decisions was the goal there.
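To make the poll-versus-push idea from the talk concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not how Cloudflare or any search engine actually implements this: the function names `content_hash` and `simulate_crawls` are hypothetical, the page is modeled as a list of body snapshots (one per time tick), and "the edge noticed a change" is assumed to mean that a SHA-256 hash of the body differs from the previous one.

```python
import hashlib


def content_hash(body: str) -> str:
    """Fingerprint a page body (assumption: a hash diff means 'content changed')."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def simulate_crawls(versions, poll_every):
    """Compare a polling crawler with a hint-driven (push) crawler.

    versions[t] is the page body at tick t (hypothetical toy model).
    The polling crawler fetches every `poll_every` ticks no matter what.
    The push crawler fetches only when the edge sees the hash change.
    """
    # Polling: fetch on a fixed cadence and count fetches that found nothing new.
    poll_fetches = 0
    poll_redundant = 0
    last_seen = None
    for t in range(0, len(versions), poll_every):
        poll_fetches += 1
        h = content_hash(versions[t])
        if h == last_seen:
            poll_redundant += 1
        last_seen = h

    # Push: the edge tracks the current hash and emits a hint only on change.
    push_fetches = 0
    edge_hash = None
    for body in versions:
        h = content_hash(body)
        if h != edge_hash:
            edge_hash = h
            push_fetches += 1  # one hint sent, one crawl triggered

    return {
        "poll_fetches": poll_fetches,
        "poll_redundant": poll_redundant,
        "push_fetches": push_fetches,
    }


if __name__ == "__main__":
    # A page that changes once in ten ticks, polled on every tick.
    page_history = ["v1"] * 5 + ["v2"] * 5
    print(simulate_crawls(page_history, poll_every=1))
    # {'poll_fetches': 10, 'poll_redundant': 8, 'push_fetches': 2}
```

In this toy run the polling crawler makes ten requests, eight of which find nothing new, while the hint-driven crawler makes two. Real crawlers use far richer scheduling heuristics, so the numbers only illustrate the direction of the saving.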

I always cringe when I hear the win-win-win term, but this really feels like that, right? The search engines get fresher content than they would if they were polling constantly, our customers and websites get less crawler traffic, and the internet is a little greener as a result. Are there any downsides to this thing? What’s the catch?

I mean, nothing really comes to mind. Cloudflare sits in this privileged position between a lot of these client requests and a lot of origin servers that, without Cloudflare, would field the vast majority of these requests. Because of where we sit in the internet’s architecture, we can really do some good here. It’s a good opportunity for us to use some of the information that we have to help reduce the carbon emissions associated with these redundant crawls. The servers that would otherwise be fielding these excessive crawls can be repurposed to field other types of traffic, to do something more useful.

I thought you would say the downside is that your engineering team has to do a whole bunch of high-end engineering work. Well,

Unknown Speaker 21:31
it’s actually a really cool little snapshot of how product development here works, right? We start with one idea, we learn a bunch of stuff, and then we say, how can we make this bigger and make this cooler? None of these things happened overnight; they are all built on years of trying lots of things. And a lot of these things don’t work, right? This happens to be a good couple of successes in a row. In the context of carbon emissions and climate change, we’re obviously looking at a couple of big questions, as a society and then specific to the internet. How can the internet be more sustainable? And whose job is it to make sure that we go in the right direction there? I think the good news is that everyone’s interests are aligned,

Unknown Speaker 22:26
we have to pay for the power that powers our servers around the world. We want to deliver content from as close as possible and to waste as little power as possible. And we’re not alone. Google, Facebook, and Microsoft invest enormous resources to figure out how to run processing as efficiently as possible across their networks. So at Cloudflare, we’ve been just obsessed for a long time with how we can be as power-efficient as possible. One of the frustrations I’ve had is that for a long time the chip industry, at least in server chips, talked like they cared about power optimization, but it didn’t feel like it was a big priority. I remember going up to the Intel research center back in early 2012. We were a tiny company at the time, so it was amazing that they were even inviting us there. But to their credit, I think they saw that we had a lot of potential. I remember sitting with a bunch of their senior chip team and saying, we care about cores per watt: we want to have as many cores as possible with as low a wattage as possible. And they spent the entire time trying to convince me that that actually wasn’t what we needed, and that in servers, power efficiency wasn’t as much of a priority. Honestly, it was just a disappointing conversation, not only because we were obsessed over how to make our services as efficient as possible, but because if you thought about how that conversation played out across millions of server chips, it was just an enormous amount of wasted energy being consumed. What’s changed in the last little bit is, first of all, with the move to first laptops and now mobile devices, there’s been so much more focus on power efficiency in those battery-powered devices.
What we’re now seeing is that efficiency making its way in the other direction. It started with mobile phones. You saw what Apple did in terms of moving their laptops to ARM-based processors. My wife has an ARM-based laptop, and the battery lasts forever, and it’s amazing. We’ve done a lot of the hard work to make sure that we could port our stack over to run our software on more efficient ARM-based chips in servers, which we figured would inevitably come. For a while, we were pretty excited about the Qualcomm ARM servers and the opportunity to have them. Unfortunately, Qualcomm shut that division down right as we were about to deploy them. But we’ve been talking with and working with nearly everyone who’s working in the ARM-based space, and the new Ampere 128-core ARM chips, which we now have in production at Cloudflare, are potentially a game-changer, not just because they can replicate the performance of what we got with an x86 design, but because they can do it at approximately half the power consumption. That’s great for us: it means that we can go into more locations, it means that we can put more servers per rack, and it means that our power bill is less. But I think it’s also incredibly important for the internet. What I hope is that this is a wake-up call for the likes of Intel and AMD, because I think more and more companies are going to start to choose what is better for the environment, and also what is better for their power bill and for their overall business. Now that you’ve got places where you can get the same performance for half the power usage, that’s just an incredible opportunity for us to make the internet as a whole much more efficient.

Yeah, I think that’s an important and obvious, but maybe not always appreciated, point.
And for any business like ours, reducing our carbon footprint and energy consumption is just good business, right? Obviously, it’s the right thing to do for the planet. But it also makes sense for our bottom line,

Unknown Speaker 27:11
which is good. Yeah, well aligned. It’s interesting to see with a product like Workers: the magic of Workers is that if you write code as a developer, it runs on every server throughout our entire network, potentially, and you don’t have to spin up instances or handle any of that. So we can distribute it everywhere. What that also means is that we’re able to run our servers at a much higher utilization rate than typical cloud providers. So just by using our platform, you are typically significantly greener than what you see from other, more traditional computing platforms. And that’s not because we’ve invented new chips or anything else. It literally is just because we can get more utilization out of our network by moving that traffic around. That’s one of those consequences whereby being able to get more out of our existing gear is not only aligned with our business interests, and not only means that our customers can scale up and down to whatever capacity they need, but also means we can make sure that it is as environmentally friendly as possible. One of the other announcements from Impact Week was green Workers, which allows developers to pick where their workloads run, to be not just as efficient as Cloudflare is normally but even more efficient over time. I think that we’re the first of the major public clouds to offer a green option for computing. And again, that’s something our entire team is really proud of. We’ve been amazed at how many people have already reached out wanting to take advantage of it.

Yeah, that makes sense. We’ve got a minute left. What’s on deck for Impact Week 2022?
I think we’ll always be thinking about how we can help the environment, which is something we’ve talked about a lot. The other piece that is just part of what we’ve done at Cloudflare all the time is thinking about how we can help civil society organizations, journalists, and governments around the world that are trying to administer elections. So I think we’re always looking for ways that we can use our network to make the internet a better place and, in turn, make the world a better place. I’m excited for what it is, and we should start working on it as soon as possible. An exciting time for the internet and for Cloudflare.

Cool, thank you both for joining, and see you on Cloudflare TV. Thanks. Awesome.
