Coded something up in Couch in an interesting way? Have a snippet or shortcode to share? Post it here for the community to benefit.
6 posts Page 1 of 1
I have been sitting on this for some time now and thought it was about time to write it up.SQL queries can be used in place of native <cms:pages> tags to speed up page load by up to 90x, which is a huge jump.

We wanted to create a graph that showed user activity (represented by cloned pages) over the course of a typical week. This was our previous code

Code: Select all
<cms:pages masterpage='activity.php' skip_custom_fields='1' limit='1000000'>
<cms:set day = "<cms:date k_page_date format='D'/>"/>
<cms:put var="add_day" value="<cms:get day/>"/>
<cms:put var="<cms:show day/>" value="<cms:add add_day '1'/>" scope='global' />
</cms:pages>


It basically ran through each cloned page, checked what day it was published and then added '1' to the running total. This did the trick but the page took 95.878 seconds to load and relied upon 38733 queries. With the help of Trendoman and a very basic understanding of constructing SQL SELECT queries, this was reduced to take 1.592 seconds to load and relied upon 33 queries. I've included the code below.

Code: Select all
<cms:repeat count='7'>
<cms:if k_count = '0'><cms:set day = "Mon"/></cms:if>
<cms:if k_count = '1'><cms:set day = "Tue"/></cms:if>
<cms:if k_count = '2'><cms:set day = "Wed"/></cms:if>
<cms:if k_count = '3'><cms:set day = "Thu"/></cms:if>
<cms:if k_count = '4'><cms:set day = "Fri"/></cms:if>
<cms:if k_count = '5'><cms:set day = "Sat"/></cms:if>
<cms:if k_count = '6'><cms:set day = "Sun"/></cms:if>
<cms:query sql=" SELECT p.id pid, t.name tname FROM <cms:php>echo K_TBL_PAGES;</cms:php> p inner join <cms:php>echo K_TBL_TEMPLATES;</cms:php> t on p.template_id = t.id WHERE t.name='activity.php' AND publish_date < '<cms:date format='Y-m-d H:i:s' />' AND DATE_FORMAT(publish_date,'%a')='<cms:show day/>' limit='1' skip_custom_fields='1'>
<cms:put var="<cms:show day/>" value="<cms:show k_total_records/>" scope='global'/>
</cms:query>
</cms:repeat>


The code above could be optimised even further by using %N as the PHP date format to prevent the need for setting the 'day' variable at all.

Again this is a niche application but I am certain there are other uses for SQL queries to speed up page load times significantly when using large data sets.

Here are is another speed improvements

from 80.867 sec to 3.155 sec
from 38737 queries to 67 queries
Great job, awesome result!

I must say, I maybe only inspired this achievement, but thanks for the feedback. You did the learning yourself and the result is clearly impressive. It also confirms (again), CouchCMS is a great tool in right hands.

Speed-wise, this can be really improved maybe 2x more. Currently, provided query looks for all matched pages, instead of counting them. I guess with your increasing number of pages some milliseconds might be saved, bringing maybe 1.5x-2x improvement.

And, secondly, with some deep interest in SQL language, it can be requested to make database return a simple table with 7 rows, with day name in the first column and count in the second. Nowadays it might be even considered as a hacker's skill :D
What changes need to be made to only count the pages rather than fetch them?

Is his possible with select queries or do I need something else?
Yes, it is possible with SELECT. Counting is done with count(*) or count(p.id), which are both effective and both ultimately refer to the table's primary index.

Now, I made some practical testing over a large dataset and must take back my previous words. There's almost no difference between your solution and counting (COUNT still must have, if you want to do GROUP BY). You placed limit='1', which pretty much does the job very well in minimizing the work for database. So changing that to count doesn't make much sence.

However, most of the time, database algos works much faster than php's. Calling the SQL query 7 times from a php loop (cms:repeat in our case) are more costly, than letting database work a little bit more for us and do it all once in one go.

We should be getting a reply from database as follows (only 2 days in my dataset):
ScreenCut-03-19---19-09-16-.png
db-table formatted
ScreenCut-03-19---19-09-16-.png (2.93 KiB) Viewed 3295 times


So I took the time to rewrite the whole thing, with counts and grouping. It's fun and I love messing with SQL :)
This fixed missing closing double quote for the sql part in your post. Also variables are expected by couchcms to be in lowercase, so I adressed this as well. Notice that we don't use skip_custom_fields parameter. Let's see what is going on:

Basic SQL query, which is pasted to any SQL tool:

Code: Select all
SELECT LOWER(DATE_FORMAT(p.publish_date,'%a')) AS 'weekday', COUNT(p.id) AS 'total'
FROM couch_pages p
INNER JOIN couch_templates t ON t.id=p.template_id
WHERE t.name='activity.php'
-- excluding future entries:
AND p.publish_date < NOW()
-- excluding unpublished pages:
AND NOT p.publish_date = '0000-00-00 00:00:00'
-- aaand ta-dam - output counts by weekday
GROUP BY DATE_FORMAT(p.publish_date,'%a')

Ask if something is unclear. With this, my results went from 7 times x 0.3 sec (2.1 sec) to 1 time * 0.7 sec. Which is like 300% improvement.

For the CMS part we don't need anymore the cms:repeat loop. Now, let's make it all look nice in CouchCMS:

Code: Select all
<cms:capture into='sql'>
    SELECT LOWER(DATE_FORMAT(p.publish_date,'%a')) AS 'weekday', COUNT(p.id) AS 'total'
    FROM <cms:php>echo K_TBL_PAGES;</cms:php> p
    INNER JOIN <cms:php>echo K_TBL_TEMPLATES;</cms:php> t ON t.id=p.template_id
    WHERE t.name='activity.php'
    -- excluding future entries:
    AND p.publish_date < NOW()
    -- excluding unpublished pages:
    AND NOT p.publish_date = '0000-00-00 00:00:00'
    -- aaand ta-dam - output counts by weekday
    GROUP BY DATE_FORMAT(p.publish_date,'%a')
</cms:capture>

<cms:query sql=sql>
    <cms:put var="<cms:show weekday />" value="<cms:show total />" scope='global'/>
</cms:query>


Finally, the only possible caveat here is that database returns only weekdays that exist. So, if no pages are registered in backend on, say, Tuesday, then the variable tue will not be set. I think it's still a very minor issue and can be wrapped in a simple check, for example:
Code: Select all
Monday: <cms:if mon ><cms:show mon /><cms:else />0</cms:if>
Tuesday: <cms:if tue ><cms:show tue /><cms:else />0</cms:if>
Wednesday: ..
Thursday: ..
Friday: ..
Saturday: ..
Sunday: <cms:if sun ><cms:show sun /><cms:else />0</cms:if>


Even this can be elegantly generalized like this (good for multi-language setups):
Code: Select all
<cms:repeat count='7' >
    <cms:set weekday_title = "<cms:zebra 'Monday' 'Tuesday' 'Wednesday' 'Thursday' 'Friday' 'Saturday' 'Sunday' />" />
    <cms:set weekday_short = "<cms:zebra 'mon' 'tue' 'wed' 'thu' 'fri' 'sat' 'sun' />" />
   
    <cms:show weekday_title />: <cms:if "<cms:get weekday_short />"><cms:get weekday_short /><cms:else />0</cms:if>
</cms:repeat>


Could you please test the new query and post results here?
I'll test that out later today

I've been using the SQL SUM function and this reduced my query count for another function by 97%. We wanted to add up all the "page-views" field in each clonable page. Previously I did this by iterating through the cms:pages tag and using the cms:add tag. The problem was that we had 500 clones pages and so it took a while.

I resulted to using trendoman's caching solution posted on his tips posts (which I'd highly recommend!) but this meant data wasn't up to date all the time. I'll post the SUM code soon
Here is the code for a slightly different function but one that benefitted most from using the GROUP BY function.

Code: Select all
<cms:query sql="SELECT LOWER(DATE_FORMAT(p.publish_date,'%H')) AS 'hour', COUNT(*) AS 'total' FROM <cms:php>echo K_TBL_PAGES;</cms:php> p INNER JOIN <cms:php>echo K_TBL_TEMPLATES;</cms:php> t ON t.id=p.template_id WHERE t.name='activity.php' AND p.publish_date >= '<cms:date return='today' format='Y-m-d 00:00:00'/>' GROUP BY DATE_FORMAT(p.publish_date,'%H')">
<cms:put var="variable_<cms:show hour/>" value="<cms:show total/>" scope='global'/>
</cms:query>


The amount of queries was 280 before and is now 201 and the function is 1s quicker. The page still takes 5.3 seconds but I suspect this is to do with being on a slow server and still using PHP 5.6
6 posts Page 1 of 1