Candlesticks Pattern Matching
Here we discuss candlestick pattern matching in Trading Conceiver: the intricacies of pattern matching and our approach to it.
Intricacies include:
- Patterns are not mathematical formulae
- Different descriptions
- Arbitrary values
- Mismatch between description and examples
- Canonical figures
- Bullish / bearish versions can differ
- Sight and perception can be deceptive
- Minimum requirements
- Psychology
- Number of occurrences
- Number of input parameters
- Not very specific candles
- Not mutually exclusive
- Prior trend
Intricacies in Pattern Matching
When coding software routines to implement patterns, many difficulties pop up. Here we report a list of them. The purpose is not to involve the user in the complexity of such a task, but to let him better understand the results he might get from our tool.
Patterns Are not Mathematical Formulae
Patterns in general, and candlestick patterns are no exception, are not defined through mathematical formulae. Basically all authors describe them in words, without formulae, so they are not rigorously defined. When a software engineer needs to implement them, he needs to translate the words into algorithms, that is, mathematically defined directives. No matter how detailed a description is, it is not a mathematical formula. It is not like the definition of an indicator such as the Simple Moving Average (SMA), where a formula exists: whoever calculates the SMA, the result will always be the same. This is not the case for patterns. Pattern recognition depends on the 'translation' from words to algorithm. Hence different pattern recognition software will output different results, i.e. there will not be a one-to-one correspondence between the patterns found by different software scanners. There is no right or wrong recognition software. They are just different: they interpret the pattern differently, are implemented differently, and reflect individual sensibility and personal knowledge. Even when a description looks very detailed and precise, the results of its implementation usually show right away that it is faulty, for instance because it recognizes configurations not very similar to the 'canonical' pictures.
In Trading Conceiver, we implemented our own version. Because of the above-mentioned problem, we explicitly state all and only the conditions we check to match a pattern, i.e. the necessary and sufficient conditions, meaning all of them must be true and no other checks are performed. You can find this information in the Formulae section of the description of the candle pattern.
Pattern in Formulae
To be more explicit, formulae of patterns can definitely be found, but they are very different from each other, because they have been obtained as just described, by translating words into formulae.
Different Descriptions
To exacerbate the problem, different authors give slightly different descriptions of the same pattern. We report an example about the Three Stars in the South further below. Again, this holds even when examining formula versions of the pattern. The same author might even accept different variations of the pattern. Furthermore, some rules might be considered mandatory by some but signal enhancements by others. The latter means that a certain condition is not required but, if present, makes the signal stronger. Typical examples are:
- The longer the first candle, the more forceful the reversal signal.
- A gap between the first two candles adds to the probability of the reversal.
- The longer the upper shadow, the higher the potential of a reversal.
Arbitrary Values
Sometimes, in the description or formula of a pattern, some arbitrary values can be found. Typical examples could be:
- The lower shadow length must be at least two times that of the body.
- The body must span at least 75% of the whole high-low range.
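To see how much such values matter, here is a minimal Python sketch of the two example rules above. The Candle class, the function names and the default thresholds (2.0 and 0.75, taken from the examples) are assumptions made for illustration; they are not Trading Conceiver code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candle:
    """Hypothetical OHLC candle used only for this illustration."""
    open: float
    high: float
    low: float
    close: float

    @property
    def body(self) -> float:
        return abs(self.close - self.open)

    @property
    def range(self) -> float:
        return self.high - self.low

    @property
    def lower_shadow(self) -> float:
        return min(self.open, self.close) - self.low


def has_long_lower_shadow(c: Candle, min_ratio: float = 2.0) -> bool:
    # "The lower shadow length must be at least two times that of the body."
    return c.lower_shadow >= min_ratio * c.body


def has_dominant_body(c: Candle, min_fraction: float = 0.75) -> bool:
    # "The body must span at least 75% of the whole high-low range."
    return c.range > 0 and c.body >= min_fraction * c.range


candle = Candle(open=10.0, high=10.6, low=8.2, close=10.5)
# body = 0.5, lower shadow = 1.8: the verdict depends only on the chosen ratio
print(has_long_lower_shadow(candle, min_ratio=2.0))  # True
print(has_long_lower_shadow(candle, min_ratio=4.0))  # False
```

The same candle passes or fails depending only on the number picked, which is exactly what makes such values arbitrary.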
Mismatch Between Description and Examples
This might look surprising, but it is not rare to find, in the same source, a certain description of the pattern followed by chart examples that do not match the description. When this happens, the examples usually follow less strict rules. So should the software adhere to the stricter description or to the looser examples? We suggest that the user look thoroughly at the examples the author offers when studying candlestick patterns. Such examples should come from the real world, not be drawn purposely by hand. This way things will be clearer.
Canonical Figures
As a corollary of the previous point, the canonical figure representing the pattern, the one appearing in every source, resulting from a drawing and not taken from a real stock chart, is only partially explanatory. Sometimes that drawing represents just an example, maybe the most meaningful one, an ideal case, the epitome of the pattern. How much can a real configuration diverge from that picture? How much longer can a candle be? How much lower can its position be with respect to the others?
Bullish / Bearish Versions Can Differ
This is surprising, too. When describing both the bullish and the bearish version of a pattern, which should match once dual quantities are substituted, the same author sometimes gives slightly different requirements. It is never clear whether this is intentional or just a consequence of the pattern not being precisely defined.
Sight and Perception Can Be Deceptive
When we look at a certain candle configuration to decide whether it constitutes a pattern, we are relying on our perception, which can be misleading. Let's look at the example depicted here, representing a Doji. Someone might be tempted to consider the one on the right a Doji, and the one on the left a long-legged Doji. And yet, they have exactly the same body and the same shadows, so they are exactly the same candle. They are just drawn with a different width (maybe exaggerated here for convenience), which is irrelevant. This is just one example, involving the form factor, i.e. the width vs the height, but there are many others. For instance, the different zoom at which we observe a graph could lead us to different conclusions.
An Example: the Notorious 'Long' Adjective
As an example we want to dwell upon the adjective 'long' that appears in many descriptions, referring to candles. Here is a list of various interpretations of what a 'long' candle should be according to different authors:
- The body of the candle must be long with respect to the previous candles' body. [1]
- The body of the candle must be long with respect to the previous candles' high-low range.
- The whole high-low range of the candle must be long with respect to the previous candles' high-low range. [2]
- As in [1], and the shadows must be short compared with the body. [3]
- [1] and [2] simultaneously.
- ([1]), [2] and [3] simultaneously.
- The threshold for considering something 'long'. What is the value above which we can state the candle is long with respect to the previous ones? 75%? 120%? Three times?
- The lookback period. How many previous candles should we look at?
- The kind of average to use for the length of the previous candles. It could be a simple moving average (SMA), an exponential moving average (EMA) or anything else.
- Some sources suggest numbers for the threshold, e.g. 3 times greater than the average of the preceding candles. When putting in such a high number, the risk is that the patterns become so rare, that any result in the trading system is statistically meaningless.
- Looking at the real life examples proposed by the authors, sometimes 'long' appears to mean more probably 'not short'. This refers again to the subjectivity of the threshold for 'long'.
- Sometimes it is clear that 'long' doesn't refer to a comparison with the previous candles, but simply to the other candles of the pattern.
- Where do we start exactly to look backward for the 'previous' candles? Simply before the pattern or before the candle within the pattern required to be 'long'?
- What might look short when zoomed out, might look long when zoomed in. This is not trivial. When selecting the Fit Viewport option in Trading Conceiver charts, the zoom changes continuously when shifting the charts horizontally, and the effect of candles 'becoming' long or short is apparent.
- Candles with long bodies could appear more prominent to the eye, which could 'filter out' those with short bodies. So, when deciding mentally whether a candle is long or short, we could be misled by that.
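The free choices listed above (what is measured, the threshold, the lookback period) can be made visible in a short Python sketch. The function name, its parameters and their defaults are hypothetical, and a simple average is assumed for the reference length; an EMA or any other average would be an equally legitimate choice.

```python
from statistics import mean
from typing import Sequence


def is_long(sizes: Sequence[float], index: int,
            lookback: int = 10, threshold: float = 1.2) -> bool:
    """Return True if sizes[index] exceeds threshold times the simple
    average of the `lookback` sizes that precede it.

    `sizes` may hold body lengths or high-low ranges: the caller decides
    which interpretation of 'long' is meant.
    """
    if index < lookback:
        return False  # not enough history to compare against
    reference = mean(sizes[index - lookback:index])
    return sizes[index] > threshold * reference


bodies = [1.0, 1.0, 1.0, 1.0, 1.5]
# The last body is 1.5 times the average of the previous four
print(is_long(bodies, 4, lookback=4, threshold=1.2))  # True
print(is_long(bodies, 4, lookback=4, threshold=3.0))  # False
```

With a threshold of 3 the same candle is no longer 'long', which illustrates how a high threshold can make a pattern very rare.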
Another Example: 'Gap'
The 'gap' is another dreadful word for the software programmer. Different authors might give a different meaning to the gap between two candles. Here are some possibilities.
- Body gap, i.e. considering only the bodies.
- Whole range gap, i.e. considering shadows, too.
- Opening gap, i.e. considering only the open price; again relative to the previous body or whole range.
- There is also a gap which, in its 'up' version, compares the high of the first candle with the lower of the open and the close of the second candle.
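The gap variants listed above could be written, for the 'up' direction, roughly as in the following Python sketch. The Candle type and all function names are assumptions made for illustration; the 'down' versions would mirror them.

```python
from typing import NamedTuple


class Candle(NamedTuple):
    """Hypothetical OHLC candle used only for this illustration."""
    open: float
    high: float
    low: float
    close: float


def body_top(c: Candle) -> float:
    return max(c.open, c.close)


def body_bottom(c: Candle) -> float:
    return min(c.open, c.close)


def body_gap_up(first: Candle, second: Candle) -> bool:
    # Only the bodies are considered
    return body_bottom(second) > body_top(first)


def range_gap_up(first: Candle, second: Candle) -> bool:
    # Shadows are considered, too
    return second.low > first.high


def opening_gap_up(first: Candle, second: Candle,
                   versus_range: bool = False) -> bool:
    # Only the open of the second candle counts, measured against the
    # previous body or the previous whole range
    reference = first.high if versus_range else body_top(first)
    return second.open > reference


def body_over_high_gap_up(first: Candle, second: Candle) -> bool:
    # High of the first candle vs the lower of open and close of the second
    return body_bottom(second) > first.high


first = Candle(open=10.0, high=11.0, low=9.5, close=10.5)
second = Candle(open=10.8, high=11.2, low=10.6, close=11.1)
print(body_gap_up(first, second))            # True
print(range_gap_up(first, second))           # False
print(opening_gap_up(first, second))         # True
print(opening_gap_up(first, second, True))   # False
print(body_over_high_gap_up(first, second))  # False
```

The same pair of candles is a gap under two definitions and not a gap under the other three.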
Our Approach
In order to decide the conditions that must be satisfied for a certain pattern, we weigh all of the following considerations.
Minimum Requirements
We prefer to pick out a set of minimum requirements, comparing those from various sources. So we tend to demand the lowest possible number of rules. Obviously, they must be a reasonably complete set, not just an intersection. In particular, we tend to omit rules that we deem just signal enhancements.
Psychology
When in doubt, we refer to the psychology underlying the pattern, that is its rationale, as guidance to refine the choice. Let's take an example, the Bullish Breakaway, pictured here. We implemented it like this:
- n-4, n-3, n-1 are black
- n-2 can be black or white
- n is white
- body gap down between n-4 and n-3
- avg(n-3) ≥ avg(n-2) ≥ avg(n-1)
- open(n-3) < close(n) [1]
- Some require that the gap between n-4 and n-3 also include the shadows, not just the bodies. This choice doesn't change the psychology of the pattern: both are legitimate. We opted for the bodies version, because it seems to be the general favorite among authors.
- Some authors express the downtrend through conditions on opens and/or closes of the candles. We used the average instead. In all cases, the psychology is respected: there is an ongoing downtrend. So we didn't add any condition on open or close values. We think our constraint is better, though, because the third candle can be any color, and using open or close values is faulty.
- Some require open(n) not in the direction of the downtrend, e.g. some demand open(n) > close(n-1). We didn't add this rule because we deem it a signal enhancement rather than a requirement. Even omitting it, we think we still comply with the pattern psychology. Here the number of occurrences also comes into play, see next section.
- Instead of [1], some require the last candle closing exactly in the gap between the first two, i.e. not closing the gap completely. We think that with our weaker assumption [1] we still honored the rationale of the pattern. Here again the number of occurrences must be factored in.
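The conditions we listed for the Bullish Breakaway translate almost line by line into code. The following Python sketch is only an illustration: the Candle type and function names are hypothetical, black and white are taken to mean close below and above the open, and avg() is assumed to be the midpoint of open and close, which the list above does not specify.

```python
from typing import NamedTuple, Sequence


class Candle(NamedTuple):
    """Hypothetical OHLC candle used only for this illustration."""
    open: float
    high: float
    low: float
    close: float


def is_black(c: Candle) -> bool:
    return c.close < c.open


def is_white(c: Candle) -> bool:
    return c.close > c.open


def avg(c: Candle) -> float:
    # Assumption: midpoint of the body; other averages are possible
    return (c.open + c.close) / 2


def is_bullish_breakaway(candles: Sequence[Candle], n: int) -> bool:
    """Check the listed conditions on the five candles ending at index n."""
    if n < 4 or n >= len(candles):
        return False
    c4, c3, c2, c1, c0 = candles[n - 4:n + 1]  # n-4, n-3, n-2, n-1, n
    return (
        is_black(c4) and is_black(c3) and is_black(c1)  # n-2 can be any color
        and is_white(c0)
        # body gap down between n-4 and n-3
        and max(c3.open, c3.close) < min(c4.open, c4.close)
        and avg(c3) >= avg(c2) >= avg(c1)
        and c3.open < c0.close  # condition [1]
    )


candles = [
    Candle(20.0, 20.5, 16.5, 17.0),  # n-4: long black
    Candle(16.0, 16.2, 14.8, 15.0),  # n-3: black, body gaps down
    Candle(15.2, 15.4, 14.4, 14.6),  # n-2: any color
    Candle(14.5, 14.7, 13.9, 14.0),  # n-1: black
    Candle(14.1, 16.7, 14.0, 16.5),  # n: white, closes above open(n-3)
]
print(is_bullish_breakaway(candles, 4))  # True
```

All conditions are combined with a logical AND and nothing else is checked, in line with the necessary and sufficient approach described earlier.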
Number of Occurrences
We also take the number of occurrences into consideration when defining the list of requirements. Obviously, we try to keep the core concept intact.
Too Rare
In order for a pattern to be useful, it must not occur too rarely, or the study of the trading system will be statistically meaningless. So if a pattern is already extremely rare with a few rules, we try not to add even more conditions to be met, and maybe relax some of the existing ones. In particular, if only the canonical figure is accepted, the occurrences are usually very few. Moreover, we decided not to implement patterns that virtually never occur, see Implemented Patterns.
Too Frequent
By the same token, if a pattern happens too frequently, it is probably not very useful either. If it is always there, it probably can't give a strong signal. In that case, we try to add some requirements to limit its frequency.
Number of Input Parameters
This has been a very strong design decision.
The Code Is Already Implemented
The Composer already accepts input parameters for each trading algorithm, so accepting input parameters for candle patterns comes for free. It is already there, already coded, already available, no effort on our part.
Too Cumbersome With Many Parameters
However, we decided to limit as much as possible the number of input parameters for candlesticks. Each added parameter would of course increase the flexibility for the user, who could tailor the pattern search algorithm to his needs, but it would also make it harder to use. We believe that introducing parameters to control pattern recognition would make the software too cumbersome and exhausting to use. So we kept the number of parameters as low as possible, and used them only when strictly necessary. Please note that there are some trading algorithms in the Technical Indicators Based branch with quite a number of parameters, but they are basically the same for all of them. Apart from the parameters for the specific technical indicator, all the other parameters are always the same (MA, slope...). For candlesticks, on the contrary, each parameter would be specific to the pattern. It would be a nightmare for the user to understand what all the parameters mean and to input pertinent values. The number of parameters required could easily skyrocket, see the following example. We would probably also need to supply the user with some means to save his preferences for default values, and this for each pattern.
When a Parameter Is Added
If an arbitrary number is required for a certain condition, which would call for an input parameter, we tend to dismiss that rule. We add a parameter when a core rule must be introduced and it requires a number that would be too subjective and arbitrary to hard-code. For instance, in the Bottom Tweezers, exemplified here, the core requirement is that the lows of the two candles should be almost equal. In this case, we must introduce this rule. Deciding what 'almost' means is arbitrary. We had to let the user decide for himself and introduced an input parameter.
Example
Let's take an example, the Three Stars in the South, a simple pattern of three black candles, depicted here. We decided to implement it like this, with the following conditions that must be satisfied on day n:
- high-low range(n-1) inside high-low range(n-2) [1]
- high-low range(n) inside high-low range(n-1) [2]
- n-2 with
  - no upper shadow [3]
  - virtually no upper shadow [4]
- lower shadow of n-1
  - pretty long [5]
  - shorter than lower shadow of n-2 [6]
- body(n-1) shorter than body(n-2) [7]
- close(n-1) < close(n-2) [8]
- low(n-2) < low(n-1), instead of [1]. [9]
- candle n must be small. [10]