Re: [squid-users] Large-scale Reverse Proxy for serving images FAST

From: David Tosoff <dtosoff_at_yahoo.com>
Date: Tue, 17 Mar 2009 11:32:50 -0700 (PDT)

OK. Thanks Amos.

Changing the icp_port to a unique value for each instance worked. I should have thought of that, since all the instances are on the same host (localhost/127.0.0.1) and were sharing the same port... duhh.
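
For the archives, the change boiled down to giving each instance its own ICP port and pointing Squid0's cache_peer lines at it (the exact port numbers are just the ones I happened to pick):

# first "parent" instance's squid.conf
icp_port 3131
# second "parent" instance's squid.conf
icp_port 3132
# (the sibling cache_peer line in each parent needs the matching ICP port too)

# Squid0's squid.conf
cache_peer localhost parent 81 3131 name=imgCache1 round-robin proxy-only
cache_peer localhost parent 82 3132 name=imgCache2 round-robin proxy-only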

So, I have a few other questions then: we're going to scale this up to a single-machine, single-instance setup on 64-bit Linux with a 64-bit build of Squid 3.0 --
 - What OS would you personally recommend running Squid 3.x on for best performance?
 - Is there no limit to the cache_mem we can use in squid 3? I'd be working with about 64GB of memory in this machine.
 - Can you elaborate on "heap replacement/garbage policy"??
 - Any other options to watch for, for optimizing memory cache usage?

Thanks again!

David

--- On Tue, 3/17/09, Amos Jeffries <squid3_at_treenet.co.nz> wrote:

> From: Amos Jeffries <squid3_at_treenet.co.nz>
> Subject: Re: [squid-users] Large-scale Reverse Proxy for serving images FAST
> To: dtosoff_at_yahoo.com
> Cc: squid-users_at_squid-cache.org
> Received: Tuesday, March 17, 2009, 12:10 AM
> David Tosoff wrote:
> > All,
> >
> > I'm new to Squid and I have been given the task of
> optimizing the delivery of photos from our website. We have
> 1 main active image server which serves up the images to the
> end user via 2 chained CDNs. We want to drop the middle CDN
> as it's not performing well and is a waste of money; in
> its stead we plan to place a few reverse-proxy web
> accelerators between the primary CDN and our image server.
> >
>
> You are aware then that a few reverse-proxy accelerators
> are in fact the definition of a CDN? So you are building
> your own instead of paying for one.
>
> Thank you for choosing Squid.
>
> > We currently receive 152 hits/sec on average, with
> about 550 hits/sec max to our secondary CDN from cache misses at
> the Primary.
> > I would like to serve a lot of this content straight
> from memory to get it out there as fast as possible.
> >
> > I've read around that there are memory and
> processing limitations in Squid in the magnitude of 2-4GB
> RAM and 1 core/1 thread, respectively. So, my solution was
> to run multiple instances, as we don't have the
> rackspace to scale this out otherwise.
> >
>
> Memory limitations on large objects only exist in Squid-2,
> and the 2-4GB RAM issues reported recently are due only to
> 32-bit builds on 32-bit hardware.
>
> Your 8GB cache_mem settings below and stated object size
> show these are not problems for your Squid.
>
> 152 req/sec is not enough to raise the CPU temperature with
> Squid; 550 might be noticeable but not a problem. 2700
> req/sec has been measured in accelerator Squid-2.6 on a
> 2.6GHz dual-core CPU and more performance improvements have
> been added since then.
>
>
> > I've managed to build a working config of 1:1
> squid:origin, but I am having trouble scaling this up and
> out.
> >
> > Here is what I have attempted to do, maybe someone can
> point me in the right direction:
> >
> > Current config:
> > User Browser -> Prim CDN -> Sec CDN -> Our
> Image server @ http port 80
> >
> > New config idea:
> > User -> Prim CDN -> Squid0 @ http :80 ->
> round-robin to "parent" squid instances on same
> machine @ http :81, :82, etc -> Our Image server @ http
> :80
> >
> >
> > Squid0's (per diagram above) squid.conf:
> >
> > acl Safe_ports port 80
> > acl PICS_DOM_COM dstdomain pics.domain.com
> > acl SQUID_PEERS src 127.0.0.1
> > http_access allow PICS_DOM_COM
> > icp_access allow SQUID_PEERS
> > miss_access allow SQUID_PEERS
> > http_port 80 accel defaultsite=pics.domain.com
> > cache_peer localhost parent 81 3130 name=imgCache1
> round-robin proxy-only
> > cache_peer localhost parent 82 3130 name=imgCache2
> round-robin proxy-only
> > cache_peer_access imgCache1 allow PICS_DOM_COM
> > cache_peer_access imgCache2 allow PICS_DOM_COM
> > cache_mem 8192 MB
> > maximum_object_size_in_memory 100 KB
> > cache_dir aufs /usr/local/squid0/cache 1024 16 256 --
> This one isn't really relevant, as nothing is being
> cached on this instance (proxy-only)
> > icp_port 3130
> > visible_hostname pics.domain.com/0
> >
> > Everything else is per the defaults in squid.conf.
> >
> >
> > "Parent" squids' (from above diagram)
> squid.conf:
> >
> > acl Safe_ports port 81
> > acl PICS_DOM_COM dstdomain pics.domain.com
> > acl SQUID_PEERS src 127.0.0.1
> > http_access allow PICS_DOM_COM
> > icp_access allow SQUID_PEERS
> > miss_access allow SQUID_PEERS
> > http_port 81 accel defaultsite=pics.domain.com
> > cache_peer 192.168.0.223 parent 80 0 no-query
> originserver name=imgParent
> > cache_peer localhost sibling 82 3130 name=imgCache2
> proxy-only
> > cache_peer_access imgParent allow PICS_DOM_COM
> > cache_peer_access imgCache2 allow PICS_DOM_COM
> > cache_mem 8192 MB
> > maximum_object_size_in_memory 100 KB
> > cache_dir aufs /usr/local/squid1/cache 10240 16 256
> > visible_hostname pics.domain.com/1
> > icp_port 3130
> > icp_hit_stale on
> >
> > Everything else per defaults.
> >
> >
> >
> > So, when I run this config and test I see the
> following happen in the logs:
> >
> > From "Squid0" I see that it resolves to grab
> the image from one of its parent caches. This is great!
> (some show as "Timeout_first_up_parent" and others
> as just "first_up_parent")
> >
> > 1237253713.769 62 127.0.0.1 TCP_MISS/200 2544 GET
> http://pics.domain.com:81/thumbnails/59/78/45673695.jpg -
> TIMEOUT_FIRST_UP_PARENT/imgParent image/jpeg
> >
> > From the parent cache that it resolves to, I see that
> it grabs the image from ITS parent, the originserver (our
> image server). Subsequent requests are 'TCP_HIT' or
> mem hit. Great stuff!
> >
> > 1237253713.769 62 127.0.0.1 TCP_MISS/200 2694 GET
> http://pics.domain.com/thumbnails/59/78/45673695.jpg -
> FIRST_PARENT_MISS/imgCache1 image/jpeg
> >
> >
> > Problem is, it doesn't round-robin the requests to
> both of my "parent" squids and you end up with a
> very 1-sided cache. If I stop the "parent"
> instance that is resolving the items, the second
> "parent" doesn't take over either. If I then
> proceed to restart the "Squid0" instance, it will
> then direct the requests to the second "parent",
> but then the first won't receive any requests. So I know both
> "parent" configs work, but I must be doing
> something wrong somewhere, or is this all just a silly
> idea...?
> >
>
> This is caused by Squid0 only sending ICP queries to a
> single peer (itself?) on port 3130. Each squid needs a full
> set of its own unique listening ports.
>
> >
> > Can anyone comment on the best way to run a
> high-traffic set of accel cache instances similar to this,
> or how to fix what I've tried to do? Or another way to
> put a LOT of data into a squid instance's memory. (We
> have ~150Million x 2KB images that are randomly requested).
> > I'd like to see different content cached on each
> instance with little or no overlap: round-robin deciding
> which squid gets to cache an item, and ICP determining
> which squid already has that item.
> >
> > I'm open to other ideas too..
>
> Some things you have misunderstood:
>
> cache_mem is the size of the RAM cache used by each Squid
> (same as a cache_dir but without disk parameters), so
> proxy-only reduces its usefulness at Squid0.
>
> Your Squid0 should not need a cache_mem much larger than
> the max hot-objects throughput for a short (minutes) window
> of data. I'd start with around 500MB and test various
> changes.
>
> Your Squid1/Squid2 should get as much cache_mem as they can
> possibly hold, possibly combined with a heap replacement
> (garbage-collection) policy.
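>
> As a starting sketch only (the sizes and policy names here are
> guesses to tune against your own measurements):
>
> # Squid0 (front, proxy-only): a small RAM cache is enough
> cache_mem 512 MB
>
> # Squid1/Squid2 (the caching parents): as much RAM as the box can spare
> cache_mem 8192 MB
> # GDSF favours keeping many small, popular objects (e.g. thumbnails) hot
> memory_replacement_policy heap GDSF
> # LFUDA favours byte hit ratio on the disk cache
> cache_replacement_policy heap LFUDA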
>
> miss_access is followed by an implicit "deny all", but in
> your stated topology only the external CDN will be
> requesting access to Squid0, so this is counter-productive.
> Leave it at the default 'allow all'.
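>
> In config terms that means simply dropping the miss_access line from
> Squid0 (or making the default explicit):
>
> # miss_access allow SQUID_PEERS   <- with the implicit "deny all" this
> #                                    would refuse misses from the CDN
> miss_access allow all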
>
>
> On the hierarchy method itself, I'd go with either a
> single Squid doing everything (simpler, and one should handle
> the peak load) or, if you must expand to two layers, CARP
> to hash-balance the URLs between the parent peers.
>
> round-robin will only balance the request-count load. CARP
> balances on unique URI hashes, so no object duplication
> occurs in the second-layer Squids. It also removes any need
> for horizontal links to the other sibling, since the CARP
> hash guarantees that no other 2nd-layer squid has the
> object fresh.
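>
> A rough sketch of the CARP variant, reusing your own names and ports:
>
> # Squid0: select the parent by URL hash; ICP queries are not needed
> # because CARP picks the parent deterministically
> cache_peer localhost parent 81 0 no-query name=imgCache1 carp proxy-only
> cache_peer localhost parent 82 0 no-query name=imgCache2 carp proxy-only
>
> # Squid1/Squid2: keep only the originserver parent; the sibling
> # cache_peer, icp_port and icp_hit_stale lines can go
> cache_peer 192.168.0.223 parent 80 0 no-query originserver name=imgParent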
>
> Amos
> --
> Please be using
> Current Stable Squid 2.7.STABLE6 or 3.0.STABLE13
> Current Beta Squid 3.1.0.6

Received on Tue Mar 17 2009 - 18:33:00 MDT
