As Easy As Jagger 1 – 2 – 3

We wanted to give an update to those who might not have heard about this Google update. According to Matt Cutts, one of the Google inventors of the recent patent, this latest update, labeled Jagger, is a three-step program. Jagger 1 is up and live and seems to have killed a lot of webmasters’ natural listings. Jagger 2 is currently spreading like a virus throughout the web, with the final step, Jagger 3, expected to start next week.

Reading through the patent, it appears they are really going after spammy techniques and link purchasing employed by webmasters. What we’ve noticed is that some sites originally took big hits in the SERPs, but Jagger 2 has begun bringing them back to the top. We expect that after Jagger 3 has completed, listings will be stable again, as they were before. Here is the recent post from Matt Cutts’s blog. . .

It looks like Jagger2 is starting to be visible. GoogleGuy posted over on WebmasterWorld with what SEOs should expect:

McMohan, good eyes in spotting some changes at 66.102.9.104. I expect Jagger2 to start at 66.102.9.x. It will probably stay at 1-2 data centers for the next several days rather than spreading quickly. But that data center shows the direction that things will be moving in (bear in mind that things are fluxing, and Jagger3 will cause flux as well).

Matt Cutts posted how to send feedback on Jagger1 at http://www.mattcutts.com/blog/update-jagger-contacting-google/

If you’re looking at 66.102.9.x and have new feedback on what you see there (whether it be spam or just indexing related), please use the same mechanism as before, except use the keyword Jagger2. I believe that our webspam team has taken a first pass through the Jagger1 feedback and acted on a majority of the spam reports. The quality team may wait until Jagger3 is visible somewhere before delving into the non-spam index feedback.

If things stay on the same schedule (which I can’t promise, but I’ll keep you posted if I learn more), Jagger3 might be visible at one data center next week. Folks should have several weeks to give us feedback on Jagger3 as it gradually becomes more visible at more data centers.

Not much more I can add to that. Even on 66.102.9.104, it will still take a day or so for the changes to be fully visible at that data center. Just to re-emphasize, if you send new feedback on a data center such as 66.102.9.104, be sure to use the keyword jagger2 in spam reports or index feedback so that we can tell this is newer feedback. Jagger1, Jagger2, and Jagger3 are mostly independent changes, but they’re occurring closely enough in time (plus they interact to some degree) that it’s clearer just to act as if they were one update for feedback purposes.

I think the moral of the story is that it’s very possible your site will drop off initially but eventually rebound . . . IF…AND ONLY IF … your site employs whitehat techniques and is full of content related to your subject. Keep building your linking campaign with relevant partner sites and build pages and pages of relevant content. Make your pages or site easy to bookmark, since bookmarks are a new signal the patent says Google may watch for ranking.


Dissection of New Google Patent

While some people may take a quick peek at the new Google patent and quickly get a tired head, the true geeks like to dissect the crazy talk and try to make sense of it. I figured I would put down my thoughts on and perception of certain key sections of their new piece of work. Maybe it will help those who feel like their head will explode if they read any more of this new ranking system. 🙂

Let’s do ourselves a favor and skip the “claims” and move down to the meat of the patent application, labeled Description.

[0008] Both categories of search engines strive to provide high quality results for a search query. There are several factors that may affect the quality of the results generated by a search engine. For example, some web site producers use spamming techniques to artificially inflate their rank. Also, “stale” documents (i.e., those documents that have not been updated for a period of time and, thus, contain stale data) may be ranked higher than “fresher” documents (i.e., those documents that have been more recently updated and, thus, contain more recent data). In some particular contexts, the higher ranking stale documents degrade the search results.

They are saying here that spammers can artificially inflate their rank, and that stale documents (ones that have not been updated in a while) can end up ranked above fresher ones, which hurts the results for some queries.

SUMMARY OF THE INVENTION

[0010] Systems and methods consistent with the principles of the invention may score documents based, at least in part, on history data associated with the documents. This scoring may be used to improve search results generated in connection with a search query.

[0011] According to one aspect consistent with the principles of the invention, a method for scoring a document is provided. The method may include identifying a document and obtaining one or more types of history data associated with the document. The method may further include generating a score for the document based, at least in part, on the one or more types of history data.

[0012] According to another aspect, a method for scoring documents is provided. The method may include determining an age of linkage data associated with a linked document and ranking the linked document based on a decaying function of the age of the linkage data.

In this summary, they are implying the scoring of sites will be based on historical information like the age of back links.

[0032] History component 320 may gather history data associated with the documents in document corpus 340. In implementations consistent with the principles of the invention, the history data may include data relating to: document inception dates; document content updates/changes; query analysis; link-based criteria; anchor text (e.g., the text in which a hyperlink is embedded, typically underlined or otherwise highlighted in a document); traffic; user behavior; domain-related information; ranking history; user maintained/generated data (e.g., bookmarks); unique words, bigrams, and phrases in anchor text; linkage of independent peers; and/or document topics. These different types of history data are described in additional detail below. In other implementations, the history data may include additional or different kinds of data.

This is a good section which details the many bits of data that may be included in the scoring of websites.

[0038] Search engine 125 may use the inception date of a document for scoring of the document. For example, it may be assumed that a document with a fairly recent inception date will not have a significant number of links from other documents (i.e., back links). For existing link-based scoring techniques that score based on the number of links to/from a document, this recent document may be scored lower than an older document that has a larger number of links (e.g., back links). When the inception date of the documents are considered, however, the scores of the documents may be modified (either positively or negatively) based on the documents’ inception dates.

[0039] Consider the example of a document with an inception date of yesterday that is referenced by 10 back links. This document may be scored higher by search engine 125 than a document with an inception date of 10 years ago that is referenced by 100 back links because the rate of link growth for the former is relatively higher than the latter. While a spiky rate of growth in the number of back links may be a factor used by search engine 125 to score documents, it may also signal an attempt to spam search engine 125. Accordingly, in this situation, search engine 125 may actually lower the score of a document(s) to reduce the effect of spamming.

These two sections talk about scoring a site based on its inception date and the back links pointing to it. They also talk about scoring based on the rate at which new back links appear. In other words, if a brand new site goes from 10 links to 10,000 links in a week, that spike may look like spamming to increase its score, and Google may lower the score rather than increase it.

[0041] In one implementation, search engine 125 may modify the link-based score of a document as follows:

H=L/log(F+2),

[0042] where H may refer to the history-adjusted link score, L may refer to the link score given to the document, which can be derived using any known link scoring technique (e.g., the scoring technique described in U.S. Pat. No. 6,285,999) that assigns a score to a document based on links to/from the document, and F may refer to elapsed time measured from the inception date associated with the document (or a window within this period).

[0043] For some queries, older documents may be more favorable than newer ones. As a result, it may be beneficial to adjust the score of a document based on the difference (in age) from the average age of the result set. In other words, search engine 125 may determine the age of each of the documents in a result set (e.g., using their inception dates), determine the average age of the documents, and modify the scores of the documents (either positively or negatively) based on a difference between the documents’ age and the average age.

This illustrates a formula used to get the score. The link score appears to be related to their PageRank, although I have not gone through the related patent since it relates to several others. (Quite a web of patents!) It appears they feed an existing score for a website into the new formula, using history and link age to produce a new score. I’m definitely not a math whiz who can break this down, but I’m sure the geniuses will soon follow with their interpretations.
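For those who would rather see the formula as code, here is a minimal sketch of H=L/log(F+2). The patent does not say which units F uses or which logarithm base is meant, so measuring F in days and using the natural log are my assumptions, and the function name is made up for illustration.

```python
import math


def history_adjusted_link_score(link_score: float, elapsed_days: float) -> float:
    """Return H = L / log(F + 2) from paragraphs [0041]-[0042].

    Assumptions (not stated in the patent): F is measured in days and
    log is the natural logarithm.
    """
    if elapsed_days < 0:
        raise ValueError("elapsed_days must be non-negative")
    # F + 2 keeps the denominator positive even for a brand new document (F = 0).
    return link_score / math.log(elapsed_days + 2)


if __name__ == "__main__":
    # Same link score L, different ages F.
    for days in (0, 30, 365, 3650):
        print(days, round(history_adjusted_link_score(10.0, days), 3))
```

With the same link score, the newer document comes out with the higher adjusted score, because the divisor grows with the document's age.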

[0047] In one implementation, search engine 125 may generate a content update score (U) as follows:

[0048] U=f(UF, UA),

[0049] where f may refer to a function, such as a sum or weighted sum, UF may refer to an update frequency score that represents how often a document (or page) is updated, and UA may refer to an update amount score that represents how much the document (or page) has changed over time. UF may be determined in a number of ways, including as an average time between updates, the number of updates in a given time period, etc.

[0050] UA may also be determined as a function of one or more factors, such as the number of “new” or unique pages associated with a document over a period of time. Another factor might include the ratio of the number of new or unique pages associated with a document over a period of time versus the total number of pages associated with that document. Yet another factor may include the amount that the document is updated over one or more periods of time (e.g., n % of a document’s visible content may change over a period t (e.g., last m months)), which might be an average value. A further factor might include the amount that the document (or page) has changed in one or more periods of time (e.g., within the last x days).

[0051] According to one exemplary implementation, UA may be determined as a function of differently weighted portions of document content. For instance, content deemed to be unimportant if updated/changed, such as Javascript, comments, advertisements, navigational elements, boilerplate material, or date/time tags, may be given relatively little weight or even ignored altogether when determining UA. On the other hand, content deemed to be important if updated/changed (e.g., more often, more recently, more extensively, etc.), such as the title or anchor text associated with the forward links, could be given more weight than changes to other content when determining UA.

[0052] UF and UA may be used in other ways to influence the score assigned to a document. For example, the rate of change in a current time period can be compared to the rate of change in another (e.g., previous) time period to determine whether there is an acceleration or deceleration trend. Documents for which there is an increase in the rate of change might be scored higher than those documents for which there is a steady rate of change, even if that rate of change is relatively high. The amount of change may also be a factor in this scoring. For example, documents for which there is an increase in the rate of change when that amount of change is greater than some threshold might be scored higher than those documents for which there is a steady rate of change or an amount of change is less than the threshold.

This part is interesting. It says they will begin scoring the actual content found and its rate of change. There are two parts: change on a page and change in the number of pages being added. The key here is that they watch for trends. If you change information every day and add 10 pages of content every day, that will be treated as normal. If another site adds information sporadically, putting up a lot of information over two days and a little over the next few days, that site’s score may be higher. And in [0051], they are all but disregarding everything but the meat of your page. This means links placed on the side or foot of the pages are probably not going to be counted much, if at all. This is a definite cry for our Billboard product! Your links are embedded into the meat of the page, which is what they do count.
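To make [0048] through [0051] a bit more concrete, here is a rough sketch that treats f as a weighted sum and computes UA from differently weighted parts of a page. The weights, the section names, and both function names are hypothetical; the patent gives no actual numbers.

```python
from typing import Mapping


def content_update_score(uf: float, ua: float,
                         w_uf: float = 0.5, w_ua: float = 0.5) -> float:
    """U = f(UF, UA), with f taken to be a weighted sum as [0049] suggests."""
    return w_uf * uf + w_ua * ua


def update_amount(changed_fraction: Mapping[str, float],
                  weights: Mapping[str, float]) -> float:
    """Weighted share of content that changed, in the spirit of [0051].

    changed_fraction maps a page section to the fraction of it that
    changed (0..1). Sections missing from weights get weight 0.
    """
    total_weight = sum(weights.get(s, 0.0) for s in changed_fraction)
    if total_weight == 0:
        return 0.0
    weighted = sum(weights.get(s, 0.0) * f for s, f in changed_fraction.items())
    return weighted / total_weight


# Hypothetical weights: main content counts, navigation and ads do not.
SECTION_WEIGHTS = {"title": 3.0, "body": 1.0, "navigation": 0.0, "ads": 0.0}

if __name__ == "__main__":
    ua = update_amount({"body": 0.5, "navigation": 1.0}, SECTION_WEIGHTS)
    print(content_update_score(uf=0.2, ua=ua))
```

With these made-up weights, rewriting only the navigation or ads produces an update amount of zero, which is the behavior the patent describes for content it deems unimportant.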

[0054] According to yet another implementation, search engine 125 may store a summary or other representation of a document and monitor this information for changes. According to a further implementation, search engine 125 may generate a similarity hash (which may be used to detect near-duplication of a document) for the document and monitor it for changes. A change in a similarity hash may be considered to indicate a relatively large change in its associated document. In other implementations, yet other techniques may be used to monitor documents for changes. In situations where adequate data storage resources exist, the full documents may be stored and used to determine changes rather than some representation of the documents.

I wanted to point this one out for one reason . . . do not duplicate your content! They have hashing techniques that can spot near-duplicates automatically, and they may ding you for it. Duplicating your content on several pages or different domains will not help you. 😉
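The patent does not say how the similarity hash is computed. One well-known way to build this kind of fingerprint is a SimHash-style hash, so the sketch below swaps that in purely as an illustration; it is not Google's method, and the word-level tokenizing and the use of MD5 are my assumptions.

```python
import hashlib


def simhash(text: str, bits: int = 64) -> int:
    """SimHash-style fingerprint: similar texts tend to share most bits."""
    counts = [0] * bits
    for token in text.lower().split():
        # Hash each word to a 64-bit integer (first 8 bytes of its MD5 digest).
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        for i in range(bits):
            counts[i] += 1 if (h >> i) & 1 else -1
    # A bit is set in the fingerprint when most tokens voted for it.
    return sum(1 << i for i in range(bits) if counts[i] > 0)


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    page_v1 = "relevant content about search engine ranking and link building"
    page_v2 = "relevant content about search engine ranking and link buying"
    print(hamming_distance(simhash(page_v1), simhash(page_v2)))
```

A small distance between two fingerprints suggests near-duplicate pages, and a fingerprint that changes a lot between crawls suggests the page itself changed a lot.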

[0058] According to an implementation consistent with the principles of the invention, one or more query-based factors may be used to generate (or alter) a score associated with a document. For example, one query-based factor may relate to the extent to which a document is selected over time when the document is included in a set of search results. In this case, search engine 125 might score documents selected relatively more often/increasingly by users higher than other documents.

[0059] Another query-based factor may relate to the occurrence of certain search terms appearing in queries over time. A particular set of search terms may increasingly appear in queries over a period of time. For example, terms relating to a “hot” topic that is gaining/has gained popularity or a breaking news event would conceivably appear frequently over a period of time. In this case, search engine 125 may score documents associated with these search terms (or queries) higher than documents not associated with these terms.

In simple terms, if people do not click your listing in results, you’ll slide down. If you are being clicked by people more often, then it will help you. And if your site deals with a hot topic then you will be scored higher than if your site is not directly related to the hot topic.

[0062] Yet another query-based factor might relate to the “staleness” of documents returned as search results. The staleness of a document may be based on factors, such as document creation date, anchor growth, traffic, content change, forward/back link growth, etc. For some queries, recent documents are very important (e.g., if searching for Frequently Asked Questions (FAQ) files, the most recent version would be highly desirable). Search engine 125 may learn which queries recent changes are most important for by analyzing which documents in search results are selected by users. More specifically, search engine 125 may consider how often users favor a more recent document that is ranked lower than an older document in the search results. Additionally, if over time a particular document is included in mostly topical queries (e.g., “World Series Champions”) versus more specific queries (e.g., “New York Yankees”), then this query-based factor–by itself or with others mentioned herein–may be used to lower a score for a document that appears to be stale.

[0063] In some situations, a stale document may be considered more favorable than more recent documents. As a result, search engine 125 may consider the extent to which a document is selected over time when generating a score for the document. For example, if for a given query, users over time tend to select a lower ranked, relatively stale, document over a higher ranked, relatively recent document, this may be used by search engine 125 as an indication to adjust a score of the stale document.

[0064] Yet another query-based factor may relate to the extent to which a document appears in results for different queries. In other words, the entropy of queries for one or more documents may be monitored and used as a basis for scoring. For example, if a particular document appears as a hit for a discordant set of queries, this may (though not necessarily) be considered a signal that the document is spam, in which case search engine 125 may score the document relatively lower.

All of this is simply saying Google will monitor the activity of users. Depending on the behavior of the users, it may score a site higher based on staleness or newness. In other words, if people searching a subject constantly select older sites, then it will score older sites higher in that subject. And vice versa for the newer sites.

[0067] According to an implementation consistent with the principles of the invention, one or more link-based factors may be used to generate (or alter) a score associated with a document. In one implementation, the link-based factors may relate to the dates that new links appear to a document and that existing links disappear. The appearance date of a link may be the first date that search engine 125 finds the link or the date of the document that contains the link (e.g., the date that the document was found with the link or the date that it was last updated). The disappearance date of a link may be the first date that the document containing the link either dropped the link or disappeared itself.

[0068] These dates may be determined by search engine 125 during a crawl or index update operation. Using this date as a reference, search engine 125 may then monitor the time-varying behavior of links to the document, such as when links appear or disappear, the rate at which links appear or disappear over time, how many links appear or disappear during a given time period, whether there is trend toward appearance of new links versus disappearance of existing links to the document, etc.

[0069] Using the time-varying behavior of links to (and/or from) a document, search engine 125 may score the document accordingly. For example, a downward trend in the number or rate of new links (e.g., based on a comparison of the number or rate of new links in a recent time period versus an older time period) over time could signal to search engine 125 that a document is stale, in which case search engine 125 may decrease the document’s score. Conversely, an upward trend may signal a “fresh” document (e.g., a document whose content is fresh–recently created or updated) that might be considered more relevant, depending on the particular situation and implementation.

This says they will monitor back links for trends. If the trend shows fewer and fewer new links, the site could be deemed a stale site. If the trend is increasing, it could be scored higher as a freshly updated site. In other words, if you continually build links, you will be fine. If you add a lot of links and stop, and then they start dropping off, not so good.
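Here is a small sketch of the comparison described in [0069]: count new links in a recent period and in the period just before it, then look at the difference. The 30-day window and the function name are my own choices, not something the patent specifies.

```python
from datetime import date, timedelta
from typing import Iterable


def link_trend(appearance_dates: Iterable[date], today: date,
               window_days: int = 30) -> int:
    """New links in the recent window minus new links in the window before it.

    Positive means link growth is picking up; negative means it is slowing.
    """
    dates = list(appearance_dates)
    recent_start = today - timedelta(days=window_days)
    older_start = today - timedelta(days=2 * window_days)
    recent = sum(1 for d in dates if recent_start < d <= today)
    older = sum(1 for d in dates if older_start < d <= recent_start)
    return recent - older


if __name__ == "__main__":
    today = date(2005, 11, 1)
    found = [today - timedelta(days=n) for n in (5, 10, 40)]
    print(link_trend(found, today))
```

A real system would presumably look at disappearing links as well, since [0068] mentions both; this sketch only counts appearances.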

[0077] The dates that links appear can also be used to detect “spam,” where owners of documents or their colleagues create links to their own document for the purpose of boosting the score assigned by a search engine. A typical, “legitimate” document attracts back links slowly. A large spike in the quantity of back links may signal a topical phenomenon (e.g., the CDC web site may develop many links quickly after an outbreak, such as SARS), or signal attempts to spam a search engine (to obtain a higher ranking and, thus, better placement in search results) by exchanging links, purchasing links, or gaining links from documents without editorial discretion on making links. Examples of documents that give links without editorial discretion include guest books, referrer logs, and “free for all” pages that let anyone add a link to a document.

This is an important part, stating that the natural link-building process happens slowly. So building link popularity should be a steady process occurring in small chunks. And definitely don’t waste your time on guestbooks/forums and FFA pages.

[0079] According to another implementation, the analysis may depend, not only on the age of the links to a document, but also on the dynamic-ness of the links. As such, search engine 125 may weight documents that have a different featured link each day, despite having a very fresh link, differently (e.g., lower) than documents that are consistently updated and consistently link to a given target document. In one exemplary implementation, search engine 125 may generate a score for a document based on the scores of the documents with links to the document for all versions of the documents within a window of time. Another version of this may factor a discount/decay into the integration based on the major update times of the document.

This one goes out to all of you participating in programs like Digital Point’s coop network and also Link Vault. . . Other than possibly getting traffic, which probably isn’t that great since most people stick them in hard-to-find locations, these programs look like they are not going to help much anymore. They are great free programs to help people out, but of course the big girl (Google) had to put a stop to them. I believe the constant changing of text ads and URLs with every visit is a flag to them, and those links will be either discounted or ignored altogether. Hopefully they will not penalize for them!

[0083] Alternatively, if the content of a document changes such that it differs significantly from the anchor text associated with its back links, then the domain associated with the document may have changed significantly (completely) from a previous incarnation. This may occur when a domain expires and a different party purchases the domain. Because anchor text is often considered to be part of the document to which its associated link points, the domain may show up in search results for queries that are no longer on topic. This is an undesirable result.

[0084] One way to address this problem is to estimate the date that a domain changed its focus. This may be done by determining a date when the text of a document changes significantly or when the text of the anchor text changes significantly. All links and/or anchor text prior to that date may then be ignored or discounted.

This one is a sharp blow to those who hover over high-ranked older sites waiting for them to expire. Buying existing sites will be no good unless their topic is directly related to what you plan to put on the site after you acquire the name. Once they determine the domain has changed its focus, all previous links may be ignored or discounted.

[0085] The freshness of anchor text may also be used as a factor in scoring documents. The freshness of an anchor text may be determined, for example, by the date of appearance/change of the anchor text, the date of appearance/change of the link associated with the anchor text, and/or the date of appearance/change of the document to which the associated link points. The date of appearance/change of the document pointed to by the link may be a good indicator of the freshness of the anchor text based on the theory that good anchor text may go unchanged when a document gets updated if it is still relevant and good. In order to not update an anchor text’s freshness from a minor edit of a tiny unrelated part of a document, each updated document may be tested for significant changes (e.g., changes to a large portion of the document or changes to many different portions of the document) and an anchor text’s freshness may be updated (or not updated) accordingly.

HELLO ROTATING ADS! ! ! ! 🙂 This says the LinkWorth rotating ads are great!

[0087] Traffic

[0088] According to an implementation consistent with the principles of the invention, information relating to traffic associated with a document over time may be used to generate (or alter) a score associated with the document. For example, search engine 125 may monitor the time-varying characteristics of traffic to, or other “use” of, a document by one or more users. A large reduction in traffic may indicate that a document may be stale (e.g., no longer be updated or may be superseded by another document).

[0089] In one implementation, search engine 125 may compare the average traffic for a document over the last j days (e.g., where j=30) to the average traffic during the month where the document received the most traffic, optionally adjusted for seasonal changes, or during the last k days (e.g., where k=365). Optionally, search engine 125 may identify repeating traffic patterns or perhaps a change in traffic patterns over time. It may be discovered that there are periods when a document is more or less popular (i.e., has more or less traffic), such as during the summer months, on weekends, or during some other seasonal time period. By identifying repeating traffic patterns or changes in traffic patterns, search engine 125 may appropriately adjust its scoring of the document during and outside of these periods.

[0090] Additionally, or alternatively, search engine 125 may monitor time-varying characteristics relating to “advertising traffic” for a particular document. For example, search engine 125 may monitor one or a combination of the following factors: (1) the extent to and rate at which advertisements are presented or updated by a given document over time; (2) the quality of the advertisers (e.g., a document whose advertisements refer/link to documents known to search engine 125 over time to have relatively high traffic and trust, such as amazon.com, may be given relatively more weight than those documents whose advertisements refer to low traffic/untrustworthy documents, such as a pornographic site); and (3) the extent to which the advertisements generate user traffic to the documents to which they relate (e.g., their click-through rate). Search engine 125 may use these time-varying characteristics relating to advertising traffic to score the document.

They are scoring sites based on traffic. They watch for seasonal trends and will know to score sites higher during their seasonal times.
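The comparison in [0089] is easy to sketch: average traffic over the last j days against the average over the last k days. The j=30 and k=365 defaults come from the patent's own examples; the function name and the idea of returning a simple ratio are my assumptions.

```python
from typing import Sequence


def traffic_ratio(daily_visits: Sequence[float], j: int = 30, k: int = 365) -> float:
    """Average of the last j days divided by the average of the last k days.

    daily_visits is ordered oldest to newest. A ratio well below 1.0 means
    recent traffic has dropped, which [0088] says may indicate a stale page.
    """
    if not daily_visits:
        raise ValueError("daily_visits must not be empty")
    recent = daily_visits[-j:]
    baseline = daily_visits[-k:]
    recent_avg = sum(recent) / len(recent)
    baseline_avg = sum(baseline) / len(baseline)
    if baseline_avg == 0:
        return 0.0
    return recent_avg / baseline_avg


if __name__ == "__main__":
    steady = [100] * 365
    dropping = [100] * 335 + [50] * 30
    print(traffic_ratio(steady), traffic_ratio(dropping))
```

This ignores the seasonal adjustment the patent mentions; handling that would mean comparing against the same period in earlier years rather than a flat yearly average.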

[0093] According to an implementation consistent with the principles of the invention, information corresponding to individual or aggregate user behavior relating to a document over time may be used to generate (or alter) a score associated with the document. For example, search engine 125 may monitor the number of times that a document is selected from a set of search results and/or the amount of time one or more users spend accessing the document. Search engine 125 may then score the document based, at least in part, on this information.

[0094] If a document is returned for a certain query and over time, or within a given time window, users spend either more or less time on average on the document given the same or similar query, then this may be used as an indication that the document is fresh or stale, respectively. For example, assume that the query “Riverview swimming schedule” returns a document with the title “Riverview Swimming Schedule.” Assume further that users used to spend 30 seconds accessing it, but now every user that selects the document only spends a few seconds accessing it. Search engine 125 may use this information to determine that the document is stale (i.e., contains an outdated swimming schedule) and score the document accordingly.

Make sure you keep your users reading your info as long as possible! If people leave your site faster each time, it will count against you, since your site may be considered stale.

[0097] According to an implementation consistent with the principles of the invention, information relating to a domain associated with a document may be used to generate (or alter) a score associated with the document. For example, search engine 125 may monitor information relating to how a document is hosted within a computer network (e.g., the Internet, an intranet or other network or database of documents) and use this information to score the document.

[0098] Individuals who attempt to deceive (spam) search engines often use throwaway or “doorway” domains and attempt to obtain as much traffic as possible before being caught. Information regarding the legitimacy of the domains may be used by search engine 125 when scoring the documents associated with these domains.

[0099] Certain signals may be used to distinguish between illegitimate and legitimate domains. For example, domains can be renewed up to a period of 10 years. Valuable (legitimate) domains are often paid for several years in advance, while doorway (illegitimate) domains rarely are used for more than a year. Therefore, the date when a domain expires in the future can be used as a factor in predicting the legitimacy of a domain and, thus, the documents associated therewith.

[0100] Also, or alternatively, the domain name server (DNS) record for a domain may be monitored to predict whether a domain is legitimate. The DNS record contains details of who registered the domain, administrative and technical addresses, and the addresses of name servers (i.e., servers that resolve the domain name into an IP address). By analyzing this data over time for a domain, illegitimate domains may be identified. For instance, search engine 125 may monitor whether physically correct address information exists over a period of time, whether contact information for the domain changes relatively often, whether there is a relatively high number of changes between different name servers and hosting companies, etc. In one implementation, a list of known-bad contact information, name servers, and/or IP addresses may be identified, stored, and used in predicting the legitimacy of a domain and, thus, the documents associated therewith.

[0101] Also, or alternatively, the age, or other information, regarding a name server associated with a domain may be used to predict the legitimacy of the domain. A “good” name server may have a mix of different domains from different registrars and have a history of hosting those domains, while a “bad” name server might host mainly pornography or doorway domains, domains with commercial words (a common indicator of spam), or primarily bulk domains from a single registrar, or might be brand new. The newness of a name server might not automatically be a negative factor in determining the legitimacy of the associated domain, but in combination with other factors, such as ones described herein, it could be.

This says they will be tracking DNS information on each domain, including name servers, IP addresses, and where the domains were registered.

[0109] In addition, or alternatively, search engine 125 may monitor the ranks of documents over time to detect sudden spikes in the ranks of the documents. A spike may indicate either a topical phenomenon (e.g., a hot topic) or an attempt to spam search engine 125 by, for example, trading or purchasing links. Search engine 125 may take measures to prevent spam attempts by, for example, employing hysteresis to allow a rank to grow at a certain rate. In another implementation, the rank for a given document may be allowed a certain maximum threshold of growth over a predefined window of time. As a further measure to differentiate a document related to a topical phenomenon from a spam document, search engine 125 may consider mentions of the document in news articles, discussion groups, etc. on the theory that spam documents will not be mentioned, for example, in the news. Any or a combination of these techniques may be used to curtail spamming attempts.

This section suggests that buying links based solely on high PageRank is not the way to go. A mixture of links will produce a more natural increase in your ranking. . . something they want to see.
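The "maximum threshold of growth over a predefined window" idea from [0109] can be sketched in a few lines. This is only our reading of it: the function names and the 10% cap are assumptions, and "rank" here means a score where higher is better.

```python
def cap_rank_growth(previous, observed, max_growth_per_window=0.10):
    """Limit how much a rank score may rise in one window; drops pass through."""
    ceiling = previous * (1 + max_growth_per_window)
    return min(observed, ceiling)


def smooth_series(observed, max_growth_per_window=0.10):
    """Apply the growth cap window by window to a series of observed scores."""
    if not observed:
        return []
    allowed = [observed[0]]
    for value in observed[1:]:
        # Each window is capped relative to the previously allowed value.
        allowed.append(cap_rank_growth(allowed[-1], value, max_growth_per_window))
    return allowed
```

A site whose raw score triples overnight would only be credited with a 10% gain per window under this sketch, while a site growing at a steady, modest pace would see no difference at all.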

[0110] It may be possible for search engine 125 to make exceptions for documents that are determined to be authoritative in some respect, such as government documents, web directories (e.g., Yahoo), and documents that have shown a relatively steady and high rank over time. For example, if an unusual spike in the number or rate of increase of links to an authoritative document occurs, then search engine 125 may consider such a document not to be spam and, thus, allow a relatively high or even no threshold for (growth of) its rank (over time).

This simply states that acquiring links from big directories like Yahoo and so on is fine. A bit hypocritical, don’t you think?! You can pay big companies for links, but you can’t pay small companies. Hmmm. . .

[0114] According to an implementation consistent with the principles of the invention, user maintained or generated data may be used to generate (or alter) a score associated with a document. For example, search engine 125 may monitor data maintained or generated by a user, such as “bookmarks,” “favorites,” or other types of data that may provide some indication of documents favored by, or of interest to, the user. Search engine 125 may obtain this data either directly (e.g., via a browser assistant) or indirectly (e.g., via a browser). Search engine 125 may then analyze over time a number of bookmarks/favorites to which a document is associated to determine the importance of the document.

[0115] Search engine 125 may also analyze upward and downward trends to add or remove the document (or more specifically, a path to the document) from the bookmarks/favorites lists, the rate at which the document is added to or removed from the bookmarks/favorites lists, and/or whether the document is added to, deleted from, or accessed through the bookmarks/favorites lists. If a number of users are adding a particular document to their bookmarks/favorites lists or often accessing the document through such lists over time, this may be considered an indication that the document is relatively important. On the other hand, if a number of users are decreasingly accessing a document indicated in their bookmarks/favorites list or are increasingly deleting/replacing the path to such document from their lists, this may be taken as an indication that the document is outdated, unpopular, etc. Search engine 125 may then score the documents accordingly.

[0116] In an alternative implementation, other types of user data that may indicate an increase or decrease in user interest in a particular document over time may be used by search engine 125 to score the document. For example, the “temp” or cache files associated with users could be monitored by search engine 125 to identify whether there is an increase or decrease in a document being added over time. Similarly, cookies associated with a particular document might be monitored by search engine 125 to determine whether there is an upward or downward trend in interest in the document.

Now is a good time to add an “ADD US TO YOUR BOOKMARKS” option to your site. They will count bookmarks as a positive toward your score.
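As a rough illustration of the bookmark trend described in [0115], here is a sketch that averages the net adds and removals per period and nudges a score up or down. The function names and the 0.01 weight are our own assumptions; the patent does not say how the trend is actually turned into a score.

```python
def bookmark_trend(adds, removals):
    """Average net change in bookmark count per period (positive means growing)."""
    if not adds or len(adds) != len(removals):
        raise ValueError("adds and removals must be non-empty and the same length")
    net = [a - r for a, r in zip(adds, removals)]
    return sum(net) / len(net)


def adjust_score(base_score, trend, weight=0.01):
    """Scale a document score by its bookmark trend; the weight is hypothetical."""
    return max(0.0, base_score * (1 + weight * trend))
```

A document that keeps getting added to bookmark lists gets a small boost, and one that users are steadily deleting gets marked down.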

[0119] According to an implementation consistent with the principles of the invention, information regarding unique words, bigrams, and phrases in anchor text may be used to generate (or alter) a score associated with a document. For example, search engine 125 may monitor web (or link) graphs and their behavior over time and use this information for scoring, spam detection, or other purposes. Naturally developed web graphs typically involve independent decisions. Synthetically generated web graphs, which are usually indicative of an intent to spam, are based on coordinated decisions, causing the profile of growth in anchor words/bigrams/phrases to likely be relatively spiky.

[0120] One reason for such spikiness may be the addition of a large number of identical anchors from many documents. Another possibility may be the addition of deliberately different anchors from a lot of documents. Search engine 125 may monitor the anchors and factor them into scoring a document to which their associated links point. For example, search engine 125 may cap the impact of suspect anchors on the score of the associated document. Alternatively, search engine 125 may use a continuous scale for the likelihood of synthetic generation and derive a multiplicative factor to scale the score for the document.

[0121] In summary, search engine 125 may generate (or alter) a score associated with a document based, at least in part, on information regarding unique words, bigrams, and phrases in anchor text associated with one or more links pointing to the document.

This just says, DON’T SPAM! It talks about machine-generated pages that have no real content and are nothing but keyword- and anchor-text-happy pages.
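Here is a minimal sketch of the "continuous scale" idea from [0120]: measure how concentrated the newly added anchor text is, and turn that into a multiplicative factor. The 0.5 threshold and the linear scaling are assumptions on our part, and this only catches the "many identical anchors" case, not the "deliberately different anchors" case the patent also mentions.

```python
from collections import Counter


def anchor_concentration(new_anchors):
    """Share of newly seen links that use the single most common anchor text."""
    if not new_anchors:
        return 0.0
    counts = Counter(a.strip().lower() for a in new_anchors)
    return counts.most_common(1)[0][1] / len(new_anchors)


def anchor_score_factor(new_anchors, suspicious_above=0.5):
    """Return a multiplier in [0, 1] to apply to the anchor-based score.

    Below the (hypothetical) threshold nothing is discounted; above it the
    factor falls linearly to 0 when every new anchor is identical.
    """
    concentration = anchor_concentration(new_anchors)
    if concentration <= suspicious_above:
        return 1.0
    return 1.0 - (concentration - suspicious_above) / (1.0 - suspicious_above)


natural = ["Acme", "acme widgets", "click here", "www.acme.example"]
spammy = ["cheap widgets"] * 9 + ["Acme"]
```

Links that arrive with a natural mix of anchors keep their full weight, while a burst of links that nearly all say the same thing is scaled down.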

[0123] According to an implementation consistent with the principles of the invention, information regarding linkage of independent peers (e.g., unrelated documents) may be used to generate (or alter) a score associated with a document.

[0124] A sudden growth in the number of apparently independent peers, incoming and/or outgoing, with a large number of links to individual documents may indicate a potentially synthetic web graph, which is an indicator of an attempt to spam. This indication may be strengthened if the growth corresponds to anchor text that is unusually coherent or discordant. This information can be used to demote the impact of such links, when used with a link-based scoring technique, either as a binary decision item (e.g., demote the score by a fixed amount) or a multiplicative factor.

This is telling us to stay relevant! Don’t plaster your text link ads all over non-relevant sites. There are definitely situations where a non-relevant site is actually relevant, but keep the majority relevant.
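The two demotion options named in [0124], a fixed amount or a multiplicative factor, could look something like the sketch below. The growth check, the 3x ratio, and the 0.25 penalty are invented for illustration; the patent gives no numbers.

```python
def is_sudden_growth(peer_counts, ratio=3.0):
    """Flag the latest period if linking peers jumped well above the prior average."""
    if len(peer_counts) < 2:
        return False
    *earlier, latest = peer_counts
    baseline = sum(earlier) / len(earlier)
    if baseline == 0:
        return latest > 0
    return latest > ratio * baseline


def demote_binary(link_score, is_suspect, penalty=0.25):
    """Binary decision: subtract a fixed (hypothetical) amount from suspect scores."""
    return max(0.0, link_score - penalty) if is_suspect else link_score


def demote_multiplicative(link_score, spam_likelihood):
    """Continuous option: scale the score by how likely the links are synthetic."""
    if not 0.0 <= spam_likelihood <= 1.0:
        raise ValueError("spam_likelihood must be between 0 and 1")
    return link_score * (1.0 - spam_likelihood)
```

The binary version is blunt but simple; the multiplicative version lets a borderline case lose only part of its link value.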

In conclusion, this new patent, which is said to be implemented during this most recent update, definitely involves some big changes in how websites are scored. The underlying goal of this patent is to combat the various techniques people use to alter their results. There is no doubt they need to work on many of these forms of spamming, but they also might be taking things to the extreme in some situations. Many of their theories do fit in many situations; however, there are also many situations where their theories go against the grain of certain markets. Although they state that their machine will adapt and “learn” the trends and patterns to eliminate these situations, there is no doubt innocent websites will be hit, and hit hard, by these overbearing attempts.

Now I can’t fault them for continuing to innovate and strive for the “perfect results”, but it’s almost as if they are focusing more on spammers than they are on relevancy. OK, so let’s say they successfully eliminate all aspects of spam sites (which will NEVER happen). Does this automatically give them the perfect natural listing results? The answer is very easy and, unlike this patent application, won’t take 4 hours to read: “NO”.

Finally, what does this mean to LinkWorth? “Nothing!” LinkWorth does sell text ads, but guess who else sells text links? Google. It’s a proven method of online marketing and our advertisers are wonderfully excited with the results we present to them. LinkWorth does not use spamming techniques since we match advertisers to relevant partners. If we focused on rankings only, we obviously would not stay in business long with the constant changes. Soon we will launch our additional services, which will include pay per click management, banner ads and a couple of very exciting new products unheard of by the public! We will change as quickly as, or more quickly than, the search engines themselves. Staying ahead of the curve is what makes or breaks companies and we pride ourselves on staying ahead. Two of our products, Billboards and Rotating Ads, directly benefit from this latest patent addition, so we’re just fine. Sticking to the straight and narrow path will allow companies to live long and healthy lives.

Ron


Recent Google Update

To those who have read about or noticed the big changes in Google’s most recent update, labeled “Jagger”, the dust is beginning to settle and search results are starting to look more reasonable. Many theories have been thrown around as to what has changed with Google’s algorithm, including new PageRank, new algo calculations, discounting this, discounting that . . . but as usual, they are complete guess-timations because no one really knows what is happening except those under the Google roof themselves. Even the update names, like “Jagger”, are complete fabrications created by someone over at a popular forum. Google has no involvement with names to the best of our knowledge. At least not any names that are given to the public.

Many people took big hits with this recent update and while it is only natural to knee jerk if you drop off the face of the earth, we have been able to talk everyone off the ledge and get them to wait around for things to settle. Google is obviously revising their quest for natural search results, which is a great thing, but in the process, things can definitely change. If you have no life and want to read some really boring techno mumbo jumbo, take a read of Google’s recent patent submission for more info. Some say it could be blue smoke being blown to disguise the true workings of their brain, but who really knows?! The bottom line is, they are constantly innovating new ideas to present the best results possible.

Some may ask, “Well why would you want their results to change?” That’s easy . . . I want only reliable results. The bottom line is, if you have a great site, and you build it so that you like it, other people like it and it’s a constant venue for new information about your subject, you will be rewarded in the search results. LinkWorth is not here to spam or trick search engines; we’re here to help companies build their visibility, name awareness and popularity. The best formula for anyone promoting their website would be a combination of text link advertising through LinkWorth and a PPC (pay per click) campaign through Google and/or Yahoo. LinkWorth will allow your visibility to be built up through sites relevant or similar to yours, and PPC will help get the instant traffic one may need.

So please take the Google update as a good change for all. Sure, it will take more work for some of you, but if they made things a breeze, it would make any market impossible unless you had millions of bucks in the bank to outpay your competition. Google makes the playing field fair and competitive for companies big and small. We’ll always be here to help through the tough times and the easy times.


SES Conference & Expo

The upcoming Search Engine Strategies Conference & Expo, occurring on December 5-8, will be held in Chicago, Illinois at the Hilton on South Michigan Avenue. For those unfamiliar with this event, it is a 4 day expo focused on search engine marketing and optimization. All of the big players are usually in attendance, including the search engines themselves. This event travels the world every year and it is a great place to put faces to organizations.

There will be a couple of LinkWorth representatives attending this expo. We will not have an exhibitor stand for this particular event, but we will be meeting many clients and/or partners who are also attending. If you are going to be attending this event and would like to try to meet us, please send us a message and provide a little info. Right now we have several meetings, but we always like to meet our customers when we can, so let us know and we’ll try to find a good time to meet.

If you are not going to be attending this expo, we do plan on having exhibitor booths at future SES expos. If we miss you during this particular event, hopefully we will get to meet you at a future expo.

For more information on this expo, please visit http://www.jupiterevents.com/sew/fall05/index.html


New LinkWorth Location

With the continued success and the future plans of LinkWorth, additional staff and additional space are important for taking our next big steps. We are happy to announce that LinkWorth will be moving into our brand new office just north of Dallas, in Lewisville, Texas at the end of the current month (October). For our local customers, please call ahead for directions if needed. For all other customers, nationally and internationally, we are making this announcement because of the changes the move will bring for everyone.

Shortly after we are moved in, the massive hiring frenzy will begin. The majority of our new employees will be sales staff who will actively seek new advertisers. We will also add personnel to assist in support, perform daily tasks, help with programming and assist with payouts. This will not only help business for all partners, but it will allow LinkWorth to grow. We have been limited in space and personnel by our existing office, so our new location will definitely benefit everyone.

During the move there will not be any downtime to services and it should be seamless. Of course, anything can happen, but we have anticipated all options and are ready to move.


The Roots of Text Linking

If someone were to ask you to define a text link, how would you visualize it? How would you explain it to someone who might not understand it? How do you think it might have changed from when the first text link was created up to a text link created today? These are great questions to dissect in order to understand the true roots of text links and their effectiveness.

One of the biggest conversations within the Search Engine Optimization / Search Engine world is the concept of text link advertising and whether it’s something looked down upon by the search giants. Let us take the biggest of all search portals, Google, and look at their view. I recently read a blog from an actual Google employee who gave his thoughts, which he claimed were also the Google view on the topic, and he openly said the buying and selling of text links is against their policies. Of course, if you read their guidelines, nowhere does it say anything about the buying and selling of text links. Now let us back up just a bit and think about how Google got to where it is today. What has made them the powerhouse they are today in the search world? The answer is very easy. . .the selling of text link ads. That product alone accounts for an overwhelming majority of their quarterly earnings.

Before we get too much further into this subject, let’s get back to our questions above. I asked how you would visualize a text link and how you would explain it. Since I can’t ask all of you for your answers and post them as I’m typing, it only makes sense to give my answers. So the way I visualize a text link is:

When I think of a text link, I think of a pathway; a direct step into another space. Sometimes I get very futuristic and think of the old TV show Star Trek. Remember the ole saying “Beam me up, Scotty”? A text link is very similar because it beams the user to another space, on another server, in a completely different geographic location.

Now when I think of how to describe it to someone that might not understand it, it would be:

A text link is a word, or group of words, that can be clicked to direct a web surfer to another location which will provide information related to that word or group of words. It is a computer reference to another bit of information that a web surfer has the option to click and read.

The last question deals with the evolution of a text link. I challenge anyone to illustrate the difference between today’s text links and the text links used when the internet was first born. There is no difference. Text links are the foundation on which the internet was built. Without text links, there would be no internet. Without text links, there would be no internet users. They are what makes the internet “user friendly” and allows surfers the ease of navigating between one location and another to read information they choose to read without someone forcing them to do so. It is also what online advertising was built on. Show me one online advertising model that doesn’t involve a text link and I’ll show you a very ineffective online advertising model.

This brings us back to the earlier discussion of how these search giants deem the buying and selling of text links “against their guidelines”. First of all, these text links are not their property, so it is not up to them to say how and when they are to be used. When the selling of text links is what has made them who they are today, how could they be so hypocritical in saying that another entity cannot do what they do? The blog I read made comments about “filters being applied” to look for these sellers and not count their outgoing links. It also mentioned applying penalties to those selling. My response to this is, if they can live with themselves and drive their value and technology into the ground by hitting the wrong sites, then I welcome this new move. Finding link sellers automatically means their technology has to identify “intent”, which is an abstract identifier. Think of how many websites that are not selling text links will get hit with this crap because of a wrong guess.

What the search engine “gods” need to realize is, whatever product they distribute, the rest of the world will continually dissect it and try to adapt. Right now, text links are the basic ingredient that search engine algorithms calculate with. In my honest opinion, it’s the best calculation besides original content that any search algorithm can use to determine which site is worthy; at least the only “fair” way to do so. Sure there will be those who figure it out, but it leaves the true results out of human hands. If Google were to hire a department of several hundred thousand people to manually review and score websites, then you would fall into the hands of favoritism.

So the moral of this blog story is text links are here to stay. Search engines can say all they want to try and prevent the buying and selling of text links, but the bottom line is, it is ADVERTISING. The same advertising they sell themselves. If search engines decide to issue filters or penalties, not only will they drive their company into the ground by pissing off stockholders, but the buying and selling of text link ads will continue to grow and prosper. Online commerce is bigger than search engines; they just happen to have the bigger methods by which one can find what they’re looking for. Right now, somewhere out there, the next empire is being put together that will make the Googles and Yahoos and MSNs of the world obsolete.

Long live text link advertising!


The Little BLUE Engine MSN Search

I’m sure anyone who has searched around through SEO firms or SEO wannabe websites has noticed that they always have their “portfolio” of websites they have pushed to the top of natural listings. What I find rather funny is how almost all of them show MSN results as their portfolio!! :))

I know MSN is trying so hard and all the while saying the same ole words the little blue engine once said, “I think I can, I think I can, I think I can…” but no matter how many dollars that “poor poor” man Bill Gates has, I don’t think he’ll ever catch the likes of Google.

Most have noticed the simplicity of MSN results, much like our “SEO Experts” out there, but has anyone noticed how volatile the results are? As you might expect, we have managed some monstrous linking campaigns behind the walls of LinkWorth and we have clients who are #1, 2 or 3 on damn near all of their terms with Google and Yahoo. The weird thing is, MSN was the first engine to shoot them to the top spot; however, these people dominating their market on the other engines now find themselves struggling to make the first page with MSN.

Obviously these are signs of little Billy “bad ass” Gates trying everything possible to get the edge on the Google Nation who have risen to power over the rest of the world. Making tweaks here and tweaks there, adjustments here and adjustments there…there has to be a way to beat those Googuys. So let’s say Bill finally finds a smooth working algorithm to fit the internet searchers’ needs. He gets the MSN search engine purring like an e-kitten and search results are weeding out spammers with ease and allowing relevant, well-built sites to list high as they should…what would be missing?

The answer to this question is simple…. PEOPLE!! As most of you have probably noticed over the past year or two there is a generic term everyone is using these days when they want to find something on the net….”GOOGLE”.

“I GOOGLE’d my name and found this.”

“Have you tried to GOOGLE that product?”

“Just GOOGLE the product name and look for one.”

Google has become the staple of search; in technology and within our own language. So before anyone has the remote chance of knocking Google off of the pedestal, they not only have to out-code their brains, but they also have to devise a catchy phrase or name for their search that people will learn to say when referring to searching with their site. And not only learn, but enjoy saying. People like to say catchy phrases or names that are easy to get out.

Let’s do a little test here and see which one of these just does not belong…

1. “Hey buddy, will you get on the net and Google the closest Pizza Hut?”

2. “Hey buddy, will you get on the net and Yahoo the next movie?”

3. “Hey dear, will you MSN the driving directions for our trip?”

Out of these 3 search giants, there is one that flows naturally, one that flows pretty naturally but that you don’t hear much, and then there is one that you never hear and that definitely doesn’t flow naturally. Which do you think?

So in conclusion to this story, MSN not only has a long way to go with the technology behind MSN search, but they really need to come up with a creative name to identify their search technology if they are ever going to become number 1 in search.


LinkWorth.com Facelift

After many tests and many failed attempts at putting a new look on LinkWorth.com, we finally have the much needed facelift! Page and linking structure has definitely changed quite a bit, but it was something we desperately needed. Our previous site was extremely tough to grow at the fast pace at which our company is growing. Our new design was set up to grow progressively as we do.

We have many additions that will soon be available on the new site, so continue to check back. We have many new solutions for advertisers and/or partners, so these will be available very soon. We hope you enjoy the new look and feel as it better represents our company.

Let us know your thoughts.


The Disappearance of PageRank

As of today, May 28, 2005, the Google PageRank has been greyed out of everyone’s toolbar. Is this just a glitch? Is it a huge update? Is it a move to something new? Or is this the disappearance of PageRank to you and me?

Personally, I believe it will be back anytime, but what if this is a turn towards PageRank being invisible? How does this affect webmasters and most importantly, how does this affect LinkWorth?

LinkWorth has been built around “link popularity”, not PageRank. The reason we do not base prices of links and the way our system works off of PageRank is for reasons just like we’re experiencing today. PageRank can disappear at any time without notice. If this happens and LinkWorth is built specifically around PageRank, then “goodbye LinkWorth”! This would not be a smart business decision, therefore, we focused on something that is the base of EVERY search engine…LINK POPULARITY.

So if PageRank disappeared today, LinkWorth would only benefit. Others in our market who sell text links based on the PageRank of that site will suffer tremendously. While this would help our business, I personally would hate to see that happen to any of those guys. It might just put them completely out of business and that is not good, no matter how you view it. I can only wish the best for those companies.

What does this mean to individual webmasters? Nothing really. Google isn’t going away, they’re just making changes (possibly) to how their search results are displayed and ranked. As long as you are building your link popularity with great relevant inbound links, you are fine.


Announcing the anticipated LinkWorth Toolbar

Well after a while of tweaking, changing, adding, removing, scratching and waiting…we are proud to announce the new LinkWorth Toolbar 1.0! Hats off to Tom for putting his elbow grease into this project; it’s really turned out great. It’s a free tool that is very beneficial to anyone buying links, reciprocating links and most of all, to LinkWorth customers. The tool is available to LinkWorth customers AND non LinkWorth customers. And you have our 100% guarantee that there are no hidden programs that come with the toolbar. There is nothing beneficial to LinkWorth by way of using this toolbar, other than seeing our toolbar on your browser. No adware. No spyware. No hidden programs. No sponsored search results.

Just as an SEO tool, it speaks for itself. It currently gathers valuable bits of information on any website you visit. Just click the “Get Stats!” link and it will pull up the information and plug it into the toolbar for you. It’s a great way to see if a possible linking partner meets your standards. You would want to know their overall link popularity based on the major search engines. Our toolbar will not only show you their backward link numbers and the number of pages indexed, but it also adds the numbers up to provide a value we call Linkrank. This is not to be confused with Google’s PageRank, which is a much more complex calculation that focuses on a particular page, not a website as a whole. Our Linkrank value is a scale ranging from 1 through 7, with 7 being the most popular. Our goal with Linkrank is not to offer up an elaborate ranking system like the almighty PageRank, but to help organize a given website’s overall popularity. If you’re looking for a quality link building service, you can contact the right experts. The Linkrank value also includes more search engines not listed on the toolbar, so for those who are familiar with our Linkrank chart, we want to make sure it causes no confusion. 😉
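For the curious, the general shape of a value like this (add up the numbers, then bucket the total onto a 1 through 7 scale) can be sketched as follows. To be clear, this is not the real Linkrank chart: the function name, the input format, and the one-step-per-power-of-ten bucketing are all made up purely for illustration.

```python
import math


def linkrank_sketch(backlinks_by_engine, indexed_pages_by_engine):
    """Map summed backlink and indexed-page counts onto a 1-7 scale.

    Hypothetical bucketing: one step per power of ten, clamped to 1..7.
    The actual Linkrank thresholds are not given in this post.
    """
    total = sum(backlinks_by_engine.values()) + sum(indexed_pages_by_engine.values())
    if total <= 0:
        return 1
    return max(1, min(7, 1 + int(math.log10(total))))


# Example inputs with invented numbers.
backlinks = {"google": 900, "yahoo": 400}
indexed = {"google": 150, "yahoo": 50}
```

The point of a coarse scale like this is only to give a quick at-a-glance sense of a whole site’s popularity, not a page-level calculation.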

The value of this toolbar to actual LinkWorth customers is wonderful. For partners who want to get notified quickly of a new pending link or an unread message in their account, the toolbar will do this. There are changing icons that alert you when a new item is in your account, and the icon can be clicked to access the LinkWorth login. For advertisers, there is also the unread messages icon alerting you to new messages in your account. If you have a combo account, it will serve as both. Most important, it gives advertisers the ability to see valuable partner stats when searching through our database. Sometimes our stats window can be time consuming to open, so the toolbar will pull several important stats to help in decision making. We already have several additions planned for a future version of the toolbar that will put your account details in your toolbar, along with additional tools to run on website domains. We have created a new forum category specifically for the LinkWorth Toolbar, so please direct any questions, comments or suggestions to the forum.

Visit the LinkWorth Toolbar page now!
