Wim Decorte, FMDiSC 14/04/2023
Table of Contents
- Introduction: Why These Concepts Matter
- Resilience: The Foundation of a Stable Solution
- Business Continuity: Keeping the Organization Running
- Disaster Recovery: Reacting and Recovering
- The Importance of Proactive Conversations
- Building Resilience in FileMaker Solutions
- Backups: The Core of Disaster Recovery
- Design Strategies for Resilience
- Distributed Architecture and Microservices
- Infrastructure-as-Code
- Practical Tips for Documentation and Procedures
- Conclusion: Defining Your Role as a Consultant
Introduction: Why These Concepts Matter
As businesses increasingly depend on technology for daily operations, discussions about disaster recovery, business continuity, and resilience become essential, not optional. In his detailed presentation, Wim Decorte from Soliant Consulting tackles the often tricky but necessary conversations that FileMaker developers need to have with clients about these topics. This blog post delves deeper into the points made during his session and offers practical advice for building robust and resilient FileMaker systems.
Resilience: The Foundation of a Stable Solution
What is Resilience?
Resilience in IT refers to the ability of a system to continue functioning even when parts of it fail. In FileMaker, this means your solution should not crash entirely if there’s a minor glitch, hardware failure, or unexpected surge in usage.
The Role of Resilience in FileMaker Systems
A FileMaker solution that is not resilient may fail at the first sign of trouble, such as a power spike or a network failure. Designing with resilience in mind produces a system that absorbs small-scale failures and keeps running.
Examples of Resilience in Practice
- Power spikes: A brief outage or power surge should not take down your FileMaker solution.
- Partial failure: A failed disk or a single corrupted database file, for instance, should not halt operations in unrelated parts of the solution.
- Load fluctuations: Resilience also involves ensuring that a sudden spike in users or queries doesn’t result in system crashes.
Business Continuity: Keeping the Organization Running
Why Business Continuity is Critical
Business continuity focuses on the organization’s ability to keep operating, even if one or more of its systems fail. While resilience ensures the system stays functional, business continuity ensures the organization continues to function effectively.
Client Expectations vs Reality
Many clients believe that their system will always run smoothly, often underestimating the importance of a structured business continuity plan. It is the job of the consultant to educate clients on potential risks and prepare them for possible disruptions.
Business Continuity Beyond Technology
Business continuity plans extend beyond the FileMaker solution itself. Consider how a client will continue operations if key hardware, network infrastructure, or critical third-party services go down. Consultants need to help clients map out manual or temporary processes that can keep them afloat during a crisis.
Disaster Recovery: Reacting and Recovering
How Disaster Recovery Differs from Business Continuity
While business continuity is about keeping operations running, disaster recovery focuses on how quickly and efficiently the IT systems can be restored after a failure. A disaster recovery plan provides a roadmap for restoring the system and ensuring that any lost data can be recovered with minimal downtime.
Key Components of a Disaster Recovery Plan
- Data backups: Ensuring that regular backups are taken and stored in multiple locations.
- Restore procedures: Detailed steps on how to recover data and return the system to operational status.
- Testing: Regular testing of disaster recovery procedures to ensure they work as expected.
Types of Disasters
Disasters come in many forms, and your recovery plan needs to account for the different types:
- Environmental: Power outages, fires, earthquakes, and other natural events.
- Hardware Failure: Server crashes, hard drive failures, etc.
- Human Error: Accidental deletions or configuration mistakes.
- Cyber Attacks: Ransomware, viruses, or other malicious intrusions.
Ransomware: A Growing Threat
Ransomware is an ever-growing threat that locks or encrypts systems and data until a ransom is paid. A robust disaster recovery plan keeps backups off-site and out of reach of infected systems, so that a clean copy survives an attack.
The Importance of Proactive Conversations
Initiating Conversations with Clients
Wim stresses the importance of proactively starting conversations about disaster recovery and resilience. If clients aren’t asking, it’s up to the consultant to bring it up. Waiting until disaster strikes is too late.
Clients of All Sizes Need These Discussions
Even small businesses can benefit from having conversations around disaster recovery. Many smaller clients don’t realize how vulnerable they are, and it’s crucial for consultants to guide them through the potential risks and preventative measures.
The Role of FileMaker Consultants
The consultant’s job is not just to develop a solution but to help clients understand the bigger picture. Consultants should define their role in these critical discussions and ensure that clients have a clear understanding of the risks and their options for resilience.
Building Resilience in FileMaker Solutions
Monitoring and Logging
Essential Logs for FileMaker Solutions
Turning on logging features in FileMaker Server (such as the Top Call Statistics log) is essential for monitoring the health of your FileMaker solution. These logs can reveal issues such as long-running scripts, performance bottlenecks, or high CPU usage that, if left unchecked, can lead to system downtime.
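As a rough sketch of how such a log can be scanned for long-running calls, the Python below reads a tab-delimited log with a header row and lists the slowest entries. The column names (`Elapsed Time`, `Operation`, `Target`) and the microsecond unit are assumptions about the log layout, so check them against the header row of your own log file first.

```python
import csv
import io

def slowest_calls(log_text, threshold_us=1_000_000, elapsed_col="Elapsed Time"):
    """Return (elapsed, operation, target) for calls at or above threshold_us.

    Assumes a tab-delimited log with a header row. The column names used
    here are assumptions and must match the real file.
    """
    reader = csv.DictReader(io.StringIO(log_text), delimiter="\t")
    slow = []
    for row in reader:
        try:
            elapsed = int(row[elapsed_col])
        except (KeyError, TypeError, ValueError):
            continue  # skip malformed or incomplete lines
        if elapsed >= threshold_us:
            slow.append((elapsed, row.get("Operation") or "", row.get("Target") or ""))
    return sorted(slow, reverse=True)  # slowest first
```

Run on a schedule, a check like this turns a passive log into an early warning: the same operation showing up at the top day after day is a candidate for optimization before it causes downtime.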
Real-Time Monitoring for Proactive Management
Real-time monitoring tools such as Zabbix, Datadog, or New Relic can alert you to early warning signs of failure, such as high memory usage or server processes failing. Implementing such tools ensures that issues are detected and dealt with before they become serious problems.
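The core idea behind these tools can be sketched in a few lines: compare current readings against agreed limits and raise an alert when a limit is reached or when a metric stops reporting. This is a minimal illustration only. The metric names and limits are hypothetical, and it does not use the API of any of the products named above.

```python
def check_thresholds(samples, limits):
    """Return alert messages for every metric at or above its limit.

    samples: dict of metric name -> latest reading (e.g. percent used)
    limits:  dict of metric name -> alert threshold
    """
    alerts = []
    for name, limit in sorted(limits.items()):
        value = samples.get(name)
        if value is None:
            # A metric that stops reporting is itself a warning sign.
            alerts.append(f"{name}: no data received")
        elif value >= limit:
            alerts.append(f"{name}: {value} >= {limit}")
    return alerts
```

For example, `check_thresholds({"memory_pct": 93}, {"memory_pct": 90})` reports the memory reading, which gives you time to act before the server runs out of memory.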
Key Infrastructure Strategies
Uninterruptible Power Supplies (UPS)
One simple but effective method of ensuring resilience is to connect your FileMaker servers to uninterruptible power supplies (UPS). This will allow servers to stay online during brief power outages and gracefully shut down if needed, minimizing the risk of data corruption.
Backup Power and Network Capacity
In larger deployments, having backup generators or redundant network connections ensures that power or network failures don’t result in long-term outages. The server’s network capacity should be able to handle fluctuations in load and avoid crashes during peak usage.
Handling Variations in Load
To ensure your FileMaker system remains resilient, it’s important to design for scalability. This means anticipating spikes in traffic or usage and ensuring the system can handle increased demands without crashing. Systems should be able to scale up or down dynamically based on load.
Backups: The Core of Disaster Recovery
The 3-2-1 Rule Explained
The 3-2-1 rule is a best practice for ensuring backups are reliable:
- Create 3 copies of your data, including the original.
- Store backups on at least 2 different types of media (e.g., external hard drive, cloud storage).
- Keep 1 copy offsite, away from the primary location, to guard against site-wide disasters.
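The rule is simple enough to check mechanically. The sketch below models each copy as a small record and reports which parts of the rule are not met. The `Copy` class and the media labels are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Copy:
    name: str
    media: str      # e.g. "server-disk", "external-drive", "cloud"
    offsite: bool

def check_3_2_1(copies):
    """Return a list of rule violations; an empty list means the rule is met."""
    problems = []
    if len(copies) < 3:
        problems.append(f"only {len(copies)} copies, need at least 3")
    if len({c.media for c in copies}) < 2:
        problems.append("all copies are on the same type of media")
    if not any(c.offsite for c in copies):
        problems.append("no copy is stored offsite")
    return problems
```

A setup with the live file on the server disk, a nightly copy on an external drive, and a third copy in cloud storage returns an empty list. Two copies on the same server disk fail all three checks.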
Regular Backup Testing
Wim emphasizes the importance of not just creating backups but testing them regularly. It’s common for backups to fail or become corrupted, and without testing, you might not find out until it’s too late. Regularly restore data from backups in a testing environment to ensure it’s valid.
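One inexpensive first check, sketched below, is to confirm that an offsite copy is byte-for-byte identical to the backup it was made from by comparing SHA-256 checksums. The function names are illustrative. A matching checksum only shows that the copy was not damaged in transit or storage; it does not prove the file opens or that the data is usable, so it complements a real test restore rather than replacing it.

```python
import hashlib

def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large backup files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_copy(original_path, copy_path):
    """Return True when both files have the same SHA-256 checksum."""
    return sha256_of(original_path) == sha256_of(copy_path)
```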
Non-Traditional Backup Approaches
While FileMaker’s native backup functionality is useful, you can enhance your disaster recovery plans by considering non-traditional methods, such as hypervisor-based snapshots. These backups capture system states quickly and can be restored in minutes, minimizing downtime.
Snapshots and Hypervisor-Based Backups
In virtualized environments, hypervisors such as VMware ESXi or Microsoft Hyper-V allow for incremental snapshots. These snapshots save only the changes since the previous snapshot, so they complete faster, take up less space, and can be restored quickly in the event of a disaster.
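The idea of storing only what changed can be shown with a toy model in which a disk is a dictionary of numbered blocks. This is a conceptual sketch under simplifying assumptions, not how any particular hypervisor implements snapshots, and it ignores deleted blocks.

```python
def take_incremental(previous, current):
    """Return only the blocks that were changed or added since `previous`."""
    return {block: data for block, data in current.items()
            if previous.get(block) != data}

def restore(base, increments):
    """Rebuild the latest state by applying increments to the base, oldest first."""
    state = dict(base)
    for delta in increments:
        state.update(delta)
    return state

# Toy example: three blocks in the full copy, one block changed on Monday,
# one block added on Tuesday.
base = {0: "aaaa", 1: "bbbb", 2: "cccc"}
monday = {0: "aaaa", 1: "BBBB", 2: "cccc"}
tuesday = {0: "aaaa", 1: "BBBB", 2: "cccc", 3: "dddd"}
deltas = [take_incremental(base, monday), take_incremental(monday, tuesday)]
```

Each delta holds a single block instead of the whole disk, which is where the savings in time and space come from. The trade-off is that a restore depends on the base copy and every increment in the chain being intact.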
Design Strategies for Resilience
Avoiding Monolithic Solutions
A monolithic design, where all data, logic, and features are in one giant file, can lead to catastrophic failures. A single point of failure could take down the entire system. Instead, break your solution into multiple, smaller files based on function.
Splitting Tables and Data
Moving tables into separate, focused files based on functionality or usage patterns can improve both performance and resilience. For example, tables that are rarely updated can be stored in a different file from transaction-heavy tables, so that damage to one file does not affect the other.
FileMaker Data Separation Models
The data separation model is a strategy where you keep user interface elements in one file and data tables in another. This makes it easier to update your solution’s UI without affecting the data structure and improves the solution’s overall resilience.
Distributed Architecture and Microservices
What is Distributed Architecture?
In a distributed architecture, different components of your FileMaker solution (like data handling, reporting, or user interfaces) are split into separate systems. These components communicate via APIs or other methods, and each component can be managed independently.
Using External Services for Specialized Tasks
For specialized tasks, such as PDF generation or sending emails, you don’t need to rely solely on FileMaker. Instead, integrate external services via APIs. This reduces the load on your FileMaker server and increases system efficiency and resilience.
FileMaker and API Integration
FileMaker’s ability to integrate with external APIs means you can offload complex tasks to specialized microservices. This enhances resilience by distributing workloads across multiple systems, reducing the chance that a failure in one area will bring down the entire solution.
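Offloading work only improves resilience if the solution also copes when the external service is unavailable. The sketch below shows one common pattern, assuming the caller supplies two functions: `primary`, which calls the external service, and `fallback`, which does something safe instead, such as queueing the job for later. Both names are hypothetical.

```python
def call_with_fallback(primary, fallback, retries=2):
    """Try `primary` up to retries + 1 times, then fall back instead of failing."""
    last_error = None
    for _ in range(retries + 1):
        try:
            return primary()
        except Exception as error:  # deliberately broad for a sketch
            last_error = error
    # The service stayed unavailable: degrade gracefully and pass on the reason.
    return fallback(last_error)
```

With this pattern, a PDF service that is down delays the PDF but does not stop the user from saving the invoice. In a real integration you would also set a timeout on the request itself so that a hung service cannot block the caller.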
Infrastructure-as-Code
What is Infrastructure-as-Code (IaC)?
Infrastructure-as-code (IaC) is the practice of defining and managing infrastructure (e.g., servers, databases) through code rather than manual configuration. This allows you to automate the creation and management of servers and other infrastructure resources.
How FileMaker Solutions Can Benefit from IaC
By using IaC, you can quickly spin up or replace FileMaker servers, ensuring a consistent configuration across environments. This speeds up disaster recovery since new infrastructure can be created and deployed in minutes, rather than hours.
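The principle behind infrastructure-as-code is to describe the desired state as data and let a program work out what must change. The sketch below is a deliberately tiny illustration of that principle. The server names and settings are hypothetical, and a real tool does far more, including actually creating the resources.

```python
# Desired state, kept in version control alongside the solution (hypothetical values).
DESIRED = {
    "fms-prod": {"os": "ubuntu-22.04", "cpu": 8, "ram_gb": 32},
    "fms-standby": {"os": "ubuntu-22.04", "cpu": 8, "ram_gb": 32},
}

def plan(desired, actual):
    """Compare desired and actual state; return the actions needed to converge."""
    actions = []
    for name in sorted(desired):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != desired[name]:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("destroy", name))
    return actions
```

After a disaster the actual state is empty, so `plan(DESIRED, {})` returns a create action for every server. That is why recovery becomes a matter of re-running the code rather than rebuilding machines from memory.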
Infrastructure from Code: The Future of Automation
In the future, AI-driven infrastructure provisioning could allow FileMaker solutions to automatically generate the necessary resources for specific workloads. This further automates resilience and recovery by ensuring that the infrastructure adapts to the needs of the solution in real time.
Practical Tips for Documentation and Procedures
Documenting Procedures for Disaster Recovery
Document every step of your disaster recovery process, including how to recover data, restart services, and restore infrastructure. This documentation should be updated regularly and made easily accessible for all team members.
Fire Drills: Practicing Your Plan
Conducting fire drills—practice runs of your disaster recovery plan—is critical for ensuring your procedures are effective. Regular drills will reveal any gaps or outdated steps in your recovery plan and allow you to refine it.
Keeping Documentation Up-to-Date
Your disaster recovery documentation is only as good as its last update. Ensure that every change to your FileMaker solution or infrastructure is reflected in your documentation. Assign someone to periodically review and update this documentation.
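A periodic review is easier to enforce when it is checked automatically. As a small sketch, assuming you record a last-reviewed date for each document, the function below lists the documents that are overdue. The 90-day interval and the document names are arbitrary examples, not a recommendation from the presentation.

```python
from datetime import date, timedelta

def stale_documents(last_reviewed, today, max_age_days=90):
    """Return the names of documents not reviewed within max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(name for name, reviewed in last_reviewed.items()
                  if reviewed < cutoff)

# Hypothetical review log for two documents.
docs = {
    "restore-runbook": date(2023, 1, 5),
    "contact-list": date(2023, 4, 1),
}
overdue = stale_documents(docs, today=date(2023, 4, 14))
```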
Conclusion: Defining Your Role as a Consultant
As a FileMaker consultant or developer, you need to decide whether you are simply delivering a technical solution or providing a comprehensive service that includes discussions on disaster recovery, resilience, and business continuity. These discussions should not be an afterthought—they are crucial to the success of your client’s business and the long-term stability of the systems you build.
https://community.claris.com/en/s/question/0D53w000066S9cGCAS/you-you-and-possibly-you