Google's SEO Starter Guide Part II

Since gathering the statistics for the last article the site has had another two impressions. One of them I can't take credit for, as I believe it was me following Google's documentation and testing that the site had been registered for crawling. The other impression, which wasn't me, was not due to the changes we put in place in the last article, as it came before those changes were published. Again there was no click-through, but the page came out at number five in the person's search, a quantum leap from number 174, and it moved the average position from 174 to (174 + 5) / 2 = 89.5. The changes that I document in this issue were implemented over a period of several days, with some failures, but were complete by 10 April 2020. I'm not sure why the number of valid mobile pages dropped to 1.

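| Date | Total Impressions | Total Clicks | Average Position | Valid Pages | Pages in Error | Valid Mobile Pages | Mobile Pages in Error |
|---|---|---|---|---|---|---|---|
| 16 March 2020 | 1 | 0 | 174 | 31 | 0 | 2 | 0 |
| 25 March 2020 | 2 | 0 | 89.5 | 31 | 0 | 2 | 0 |
| 29 March 2020 | 3 | 0 | 60 | 31 | 0 | 2 | 0 |
| 11 April 2020 | 3 | 0 | 60 | 31 | 0 | 1 | 0 |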

Understand how search engines use URLs

Now to look at the next section of recommendations from Google.

"Google recommends that all websites use https:// when possible"

"When adding your website to Search Console, we recommend adding both http:// and https:// versions, as well as the "www" and "non-www" versions."

Two very short sentences, but there is some work to do in order to achieve them. Some amendments and additional infrastructure are required. Our infrastructure starting point can be what was created in the article Url to Somewhere, but we need to look at the Amazon and Terraform documentation. HTTPS for our static website can be achieved using AWS CloudFront and AWS Certificate Manager (ACM). As well as serving the site over HTTPS I also want to prevent the use of the HTTP URLs, and I still want the subdomains and alternate domain to redirect to awebsitefromscratch.com. In terms of the new Terraform components, we need to look at the documentation for aws_acm_certificate to create the certificate in ACM and aws_acm_certificate_validation to validate it. In addition to ACM we need an aws_cloudfront_distribution, and to restrict access to the bucket so that the site can only be reached through CloudFront, and therefore only over HTTPS, we need an aws_cloudfront_origin_access_identity resource. A non-functional requirement is that I want to keep my costs as low as possible.

Changes to the Store Module.

sources/store/main.tf

terraform {
  backend "s3" {}
}

provider "aws" {}

resource "aws_cloudfront_origin_access_identity" "origin_access_identity" {
  comment = var.oai_comment
}

resource "aws_s3_bucket" "website" {
  bucket = var.main_bucket
  policy = <<POLICY
{
  "Version":"2012-10-17",
  "Statement":[
    {
      "Sid": "PolicyForCloudFrontPrivateContent",
      "Effect": "Allow",
      "Principal": {
        "AWS": "${aws_cloudfront_origin_access_identity.origin_access_identity.iam_arn}"
      },
      "Action": ["s3:GetObject"],
      "Resource": ["arn:aws:s3:::${var.main_bucket}/*"]
    }
  ]
}
POLICY

  website {
    index_document = "index.html"
  }
}

resource "aws_s3_bucket" "website_subdomains" {
  count = length(var.redirect_buckets)

  bucket = element(var.redirect_buckets, count.index)

  website {
    redirect_all_requests_to = var.main_bucket
  }
}

The aws_cloudfront_origin_access_identity resource has been added and is used as the principal in the amended bucket policy. In addition to the bucket policy change, the public-read ACL has been removed. These two changes make sure that you can only get to the website via the CloudFront distribution. To support this change we need a new input variable and some new outputs.
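As a hedged alternative to the heredoc, the same policy could be built with the aws_iam_policy_document data source, which lets Terraform check the structure for you. This is only a sketch: the data source name "website" is my own choice and is not part of the module above.

```hcl
# Sketch only: builds the same statement as the heredoc policy in main.tf.
# The data source name "website" is hypothetical.
data "aws_iam_policy_document" "website" {
  statement {
    sid       = "PolicyForCloudFrontPrivateContent"
    effect    = "Allow"
    actions   = ["s3:GetObject"]
    # Built from the variable rather than the bucket's own arn attribute,
    # so the bucket can reference this document without a dependency cycle.
    resources = ["arn:aws:s3:::${var.main_bucket}/*"]

    principals {
      type        = "AWS"
      identifiers = [aws_cloudfront_origin_access_identity.origin_access_identity.iam_arn]
    }
  }
}
```

The bucket would then use policy = data.aws_iam_policy_document.website.json in place of the heredoc.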

sources/store/variables.tf

variable "main_bucket" {
  description = "The name of the bucket hosting the website's content"
}

variable "redirect_buckets" {
  description = "A list of buckets that will redirect to the main bucket"
  type        = list(string)
  default     = []
}

variable "oai_comment" {
  description = "The comment to add to the OAI"
}

sources/store/outputs.tf

output "s3_bucket_dns_hosted_zone_id" {
  description = "Bucket hosted zone id for region"
  value       = aws_s3_bucket.website.hosted_zone_id
}

output "s3_bucket_regional_domain_name" {
  description = "The bucket's regional domain name"
  value       = aws_s3_bucket.website.bucket_regional_domain_name
}

output "sub_domain_s3_bucket_website_domain" {
  description = "Bucket dns name for the sub domain"
  value       = aws_s3_bucket.website_subdomains[*].website_domain
}

output "cloudfront_access_identity_path" {
  description = "The OAI for use with the CloudFront Distribution"
  value       = aws_cloudfront_origin_access_identity.origin_access_identity.cloudfront_access_identity_path
}

Changes to the Domain Module.

With the Store module complete, now we look at the domain module. I did consider having the CloudFront distribution in its own module, but there are so many references to the current domain resources that I decided against it. For the domain module let's start with the variables file, sources/domain/variables.tf

variable "domain_names" {
  description = "A list of domain names"
  type        = list(string)
}

variable "s3_bucket_regional_domain_name" {
  description = "The regional domain name of the s3 bucket storing the files"
}

variable "s3_bucket_dns_hosted_zone_id" {
  description = "A hosted zone id of the s3 buckets"
}

variable "domain_s3_bucket_website_domain" {
  description = "DNS name of the domain S3 bucket that redirects to CloudFront"
}

variable "sub_domain_s3_bucket_website_domain" {
  description = "A list of DNS names of the sub domain S3 buckets"
  type        = list(string)
}

variable "google_verification_code" {
  description = "The verification code for Google search"
}

variable "cloudfront_access_identity_path" {
  description = "The OAI that has access to the s3 bucket"
}

New in are the variables s3_bucket_regional_domain_name and cloudfront_access_identity_path. The bucket for the alternate domain is now passed on its own as a string in domain_s3_bucket_website_domain, while sub_domain_s3_bucket_website_domain remains a list for the two www subdomains. Now for the sources/domain/main.tf file.

terraform {
  backend "s3" {}
}

provider "aws" {}

provider "aws" {
  alias  = "us_east"
  region = "us-east-1"
}

data "aws_route53_zone" "this" {
  count = length(var.domain_names)
  name  = "${element(var.domain_names, count.index)}."
}

resource "aws_acm_certificate" "cert" {

  provider = aws.us_east

  domain_name               = element(var.domain_names, 0)
  validation_method         = "DNS"
  subject_alternative_names = [
    "www.${element(var.domain_names, 0)}",
    element(var.domain_names, 1),
    "www.${element(var.domain_names, 1)}",
  ]

  lifecycle {
    create_before_destroy = true
    ignore_changes        = [subject_alternative_names]
  }
}

resource "aws_route53_record" "cert_validation" {
  count = 2

  name    = element(aws_acm_certificate.cert.domain_validation_options.*.resource_record_name, count.index)
  type    = element(aws_acm_certificate.cert.domain_validation_options.*.resource_record_type, count.index)
  zone_id = element(data.aws_route53_zone.this[*].zone_id, 0)
  records = [element(aws_acm_certificate.cert.domain_validation_options.*.resource_record_value, count.index)]
  ttl     = 60
}

resource "aws_route53_record" "cert_validation_alt" {
  count = 2

  name    = element(aws_acm_certificate.cert.domain_validation_options.*.resource_record_name, count.index + 2)
  type    = element(aws_acm_certificate.cert.domain_validation_options.*.resource_record_type, count.index + 2)
  zone_id = element(data.aws_route53_zone.this[*].zone_id, 1)
  records = [element(aws_acm_certificate.cert.domain_validation_options.*.resource_record_value, count.index + 2)]
  ttl     = 60
}

resource "aws_acm_certificate_validation" "cert" {
  provider = aws.us_east

  certificate_arn = aws_acm_certificate.cert.arn

  validation_record_fqdns = [
    element(aws_route53_record.cert_validation.*.fqdn, 0),
    element(aws_route53_record.cert_validation.*.fqdn, 1),
    element(aws_route53_record.cert_validation_alt.*.fqdn, 0),
    element(aws_route53_record.cert_validation_alt.*.fqdn, 1),
  ]

}

resource "aws_cloudfront_distribution" "s3_distribution" {

  default_cache_behavior {
    allowed_methods  = ["GET", "HEAD"]
    cached_methods   = ["GET", "HEAD"]
    target_origin_id = "primaryS3"

    forwarded_values {
      query_string = false

      cookies {
        forward = "none"
      }
    }

    viewer_protocol_policy = "redirect-to-https"
  }

  enabled = true

  origin {
    domain_name = var.s3_bucket_regional_domain_name
    origin_id   = "primaryS3"

    s3_origin_config {
      origin_access_identity = var.cloudfront_access_identity_path
    }
  }

  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }

  viewer_certificate {
    cloudfront_default_certificate = false
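    # Reference the validation resource so the distribution waits for the certificate to be issued.
    acm_certificate_arn            = aws_acm_certificate_validation.cert.certificate_arn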
    ssl_support_method             = "sni-only"
    minimum_protocol_version       = "TLSv1.1_2016"
  }

  aliases             = [
    element(var.domain_names, 0),
    element(var.domain_names, 1),
    "www.${element(var.domain_names, 0)}",
    "www.${element(var.domain_names, 1)}"
  ]
  default_root_object = "index.html"
  price_class         = "PriceClass_100"
}

resource "aws_route53_record" "this" {

  zone_id = element(data.aws_route53_zone.this[*].zone_id, 0)
  name    = element(var.domain_names, 0)
  type    = "A"

  alias {
    name                   = aws_cloudfront_distribution.s3_distribution.domain_name
    zone_id                = aws_cloudfront_distribution.s3_distribution.hosted_zone_id
    evaluate_target_health = false
  }
}

resource "aws_route53_record" "domain" {

  zone_id = element(data.aws_route53_zone.this[*].zone_id, 1)
  name    = element(var.domain_names, 1)
  type    = "A"

  alias {
    name                   = var.domain_s3_bucket_website_domain
    zone_id                = var.s3_bucket_dns_hosted_zone_id
    evaluate_target_health = false
  }
}

resource "aws_route53_record" "sub_domains" {
  count = length(var.domain_names)

  zone_id = element(data.aws_route53_zone.this[*].zone_id, count.index)
  name    = "www.${element(var.domain_names, count.index)}"
  type    = "A"

  alias {
    name                   = element(var.sub_domain_s3_bucket_website_domain, count.index)
    zone_id                = var.s3_bucket_dns_hosted_zone_id
    evaluate_target_health = false
  }
}

resource "aws_route53_record" "google_txt" {
  name    = ""
  zone_id = element(data.aws_route53_zone.this[*].zone_id, 0)
  type    = "TXT"
  records = ["google-site-verification=${var.google_verification_code}"]
  ttl     = 300
}

This is where we have seen most of the change. Firstly we have an alias for the AWS provider. This is required because CloudFront only accepts ACM certificates issued in us-east-1. You can see the provider alias being used in the aws_acm_certificate.cert resource. I had to add ignore_changes = [subject_alternative_names] to the lifecycle block because with each run of plan or apply the subject_alternative_names list would be reordered, which forced the resource to be recreated. I opted for DNS validation, so we need the next two resources: aws_route53_record.cert_validation for the awebsitefromscratch.com domain and aws_route53_record.cert_validation_alt for the awebsitefromscratch.co.uk domain.
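The count and index arithmetic in the two validation resources relies on domain_validation_options coming back in a fixed order. A possible alternative, sketched here under assumptions, is to key the records by domain name instead. It assumes a newer AWS provider (3.x or later, where domain_validation_options is a set) and Terraform 0.12.17 or later for trimprefix. The local value zone_ids is a hypothetical helper that is not part of the module above.

```hcl
locals {
  # Hypothetical helper: map each apex domain name to its hosted zone id.
  zone_ids = zipmap(var.domain_names, data.aws_route53_zone.this[*].zone_id)
}

resource "aws_route53_record" "cert_validation" {
  # One validation record per name on the certificate, keyed by that name.
  for_each = {
    for dvo in aws_acm_certificate.cert.domain_validation_options : dvo.domain_name => {
      name   = dvo.resource_record_name
      type   = dvo.resource_record_type
      record = dvo.resource_record_value
    }
  }

  # Tolerate a validation record that already exists in the zone.
  allow_overwrite = true

  name    = each.value.name
  type    = each.value.type
  records = [each.value.record]
  ttl     = 60

  # Strip a leading "www." to find the zone that owns the name.
  zone_id = local.zone_ids[trimprefix(each.key, "www.")]
}

resource "aws_acm_certificate_validation" "cert" {
  provider = aws.us_east

  certificate_arn         = aws_acm_certificate.cert.arn
  validation_record_fqdns = [for record in aws_route53_record.cert_validation : record.fqdn]
}
```

With this shape a single resource replaces both cert_validation and cert_validation_alt, and the zone is chosen from the name rather than from a position in a list.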

The two Route 53 records, the provider alias and the aws_acm_certificate resource are all required by the aws_acm_certificate_validation resource. The aws_cloudfront_distribution resource required the most play to get right. I originally started with a custom origin, as reading the AWS and Terraform documentation it seemed that this was what was required. After failing, going over the documentation again and spending a lot of time with Google, I changed this to an S3 origin. I opted for price_class = "PriceClass_100" to keep the costs down and I currently have no restrictions, but I will revisit this after some further reading and inspecting the CloudFront distribution statistics that AWS makes available.

The final changes to the domain source are the Route 53 alias records. The two domains needed to be split, as awebsitefromscratch.com now points at the CloudFront distribution whereas awebsitefromscratch.co.uk and the two subdomains still point at their respective S3 buckets.

Changes to the terragrunt.hcl files.

The file infrastructure/terraform/prod/store/terragrunt.hcl has one new addition, oai_comment:

include {
  path = find_in_parent_folders()
}

inputs = {
  main_bucket      = "awebsitefromscratch.com"
  redirect_buckets = [
    "www.awebsitefromscratch.com",
    "awebsitefromscratch.co.uk",
    "www.awebsitefromscratch.co.uk",
  ]
  oai_comment      = "awebsitefromscratch"
}

The file infrastructure/terraform/prod/domain/terragrunt.hcl now looks like this:

include {
  path = find_in_parent_folders()
}

dependency "store" {
  config_path = "../store"
}

inputs = {
  domain_names = [
    "awebsitefromscratch.com",
    "awebsitefromscratch.co.uk"
  ]

  # redirect_buckets order: [0] www.awebsitefromscratch.com, [1] awebsitefromscratch.co.uk, [2] www.awebsitefromscratch.co.uk
  domain_s3_bucket_website_domain     = dependency.store.outputs.sub_domain_s3_bucket_website_domain[1]

  sub_domain_s3_bucket_website_domain = [
    dependency.store.outputs.sub_domain_s3_bucket_website_domain[0],
    dependency.store.outputs.sub_domain_s3_bucket_website_domain[2]
  ]

  s3_bucket_dns_hosted_zone_id        = dependency.store.outputs.s3_bucket_dns_hosted_zone_id
  google_verification_code            = "kjhkjhkjah-lkdjkbfldbkjjdbfjj"
  s3_bucket_regional_domain_name      = dependency.store.outputs.s3_bucket_regional_domain_name
  cloudfront_access_identity_path     = dependency.store.outputs.cloudfront_access_identity_path
}

Middleman changes now required

That is all of the infrastructure changes complete; however, CloudFront can't serve up content with pretty URLs. With an S3 origin set up this way, default_root_object only applies to the root of the site, so a request for a directory is not mapped to the index.html inside it. In order to get the site working correctly we need to remove the line activate :directory_indexes from the config.rb file and then deal with the consequences. Links to images in the blogs no longer work. Image links needed to be changed from ![Vanilla Project's default output](normal-project-init_small.png) to an <img> tag with src="zero-to-middleman/normal-project-init_small.png", and my article links needed to change from <a href="/blog/2020/02/20/terraform-and-terragrunt/index.html">Terraform and Terragrunt</a> to <a href="/blog/2020/02/20/terraform-and-terragrunt.html">Terraform and Terragrunt</a>.

Final Thoughts

This has been one of the more fiddly bits of infrastructure I have put together. It shows that sometimes just reading isn't enough. You need to get your hands dirty, have a play and make some mistakes.