Implementing Access Control System for Microservices Using OPA


Mar 21, 2025 - 02:22

This article is a translation of OPAを使用したマイクロサービスのアクセス制御システムの実装

I've been working on a PoC (Proof of Concept) for an access control system using OPA, and I'd like to share my findings.

The design and implementation of this PoC are available in the following repository:

bmf-san/poc-opa-access-control-system

1. Introduction

1.1 Background and Challenges

First, let's define the key concepts of "Authorization" and "Access Control" as they are used in this article:

Authorization
- Defines the scope of operations a user can perform
- Grants permissions linked to business logic
- Controls based on organizational structure and workflows
- A more abstract concept closer to "business"

Access Control
- Controls access to system resources
- Restricts access to data and APIs
- Technical control mechanisms
- A more concrete concept closer to "system"

As I'm involved in SaaS product development, I've been grappling with the complexity and challenges of permission management.

As our customers' organizational structures, business flows, and data access patterns become more diverse, I've identified several challenges in permission management systems:

Handling Complex Permission Requirements
- Flexible permission settings
- Customizable permission models
- Individual requirements for systems using the permission management system
Extensibility and Maintainability
- Adding new features and permission patterns
- Modifying existing permission logic
- Complexity in testing and debugging
Balance between Flexibility and Consistency
- Consistent permission application across the system
- Performance trade-offs
- Trade-offs between permission flexibility and system complexity

To address these challenges, the permission system architecture should meet the following requirements:

Separation from Business Logic
- Independent evolution of permission logic
- Minimize impact on business logic
- Flexible permission model implementation
Fine-grained Control
- Resource-level control
- Field-level control
- Dynamic data filtering
- Context-aware decision making
Extensible Design
- Addition of new permission models
- Custom rule implementation
- Ensuring scalability

1.2 About Open Policy Agent (OPA)

As a solution to these challenges, I considered adopting Open Policy Agent (OPA), a CNCF Graduated project, for the access control system.

OPA is a policy engine with the following characteristics:

Policy as Code
- Manage policies as code
- Easy version control and review process
- Testable policy descriptions
Declarative Policy Description
- Intuitive policy description using Rego language
- High readability and maintainability of policy logic
- Easy modularization and reuse
Separation from Services
- Policy decisions can be implemented as independent services
- Complete separation of application code and policies
- Dynamic policy updates possible

The advantages of adopting OPA include:

Affinity with Microservices
- Can operate independently as a service
- Easy integration through REST API
- High performance and lightweight runtime
Rich Features
- Field-level access control
- Complex policy rule description
- Comprehensive test support
Active Community
- CNCF Graduated project
- Comprehensive documentation
- Proven adoption cases

2. Access Control System Design

2.1 Architecture Overview

This system adopts a proxy-based architecture composed of the following main components:

Policy Enforcement Point (PEP)
- Acts as a reverse proxy
- Intercepts all requests
- Implements access control in cooperation with PDP
- Applies response data filtering

Policy Decision Point (PDP)
- Policy evaluation engine using OPA
- Access decisions based on RBAC model
- Applies field-level filtering rules
- Utilizes context information in cooperation with PIP

Policy Information Point (PIP)
- Provides information necessary for policy decisions
- Manages user information and roles
- Provides organizational structure data
- Cooperates with PRP

Policy Retrieval Point (PRP)
- Persists policy-related data
- Maps roles and permissions
- Associates users and roles
- Manages access control settings

2.2 Access Control Flow

The basic request flow is as follows:

1. The client sends a request
2. The PEP intercepts the request and extracts the necessary information
3. The PDP evaluates the policy
4. The PIP provides additional context information
5. The PEP allows or denies access based on the policy evaluation result
6. The PEP applies filtering to the response data
7. The PEP returns the result to the client

3. Implementation Points

3.1 PEP Implementation Patterns

The following patterns can be considered for implementing PEP (Policy Enforcement Point):

Proxy-based Implementation
- Reverse proxy specialized in single functionality (access control)
- Individually placed in front of each microservice
- Handles only access control, no other functionalities
- No changes required to services
- Pattern adopted in this PoC
Library-based Implementation
- Embedded as a library in each service
- Integrated with application code
- Enables more fine-grained control
- Requires service modifications
Sidecar Pattern
- Used in container environments like Kubernetes
- Deployed as a sidecar to each service's Pod
- Maintains separation between service and PEP
- High affinity with container orchestration
API Gateway Integration
- Functions as system-wide entry point
- Multi-functional including routing, authentication, rate limiting
- Centrally manages traffic to all services
- Handles cross-cutting concerns beyond access control
- Risk of becoming a single point of failure
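
For contrast with the proxy pattern, the library-based pattern can be sketched as `net/http` middleware that runs the access check inside the service process. This is only an illustration under assumptions: `Decider`, `RequireAccess`, `statusFor`, and `demoHandler` are hypothetical names that do not come from the PoC, and the in-memory decision function stands in for a real call to the PDP.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Decider answers whether userID may perform action on resource.
// Hypothetical: a real service would ask the PDP (OPA) here.
type Decider func(userID, resource, action string) bool

// RequireAccess wraps next with an access check. This is the core of a
// library-based PEP: the check is embedded in the service's own handler chain.
func RequireAccess(decide Decider, resource, action string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		userID := r.Header.Get("X-User-ID")
		if userID == "" {
			http.Error(w, "X-User-ID header is required", http.StatusBadRequest)
			return
		}
		if !decide(userID, resource, action) {
			http.Error(w, "Forbidden", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor sends one GET request through h and returns the status code.
func statusFor(h http.Handler, userID string) int {
	req := httptest.NewRequest(http.MethodGet, "/employees", nil)
	if userID != "" {
		req.Header.Set("X-User-ID", userID)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

// demoHandler builds a protected handler in which only "alice" may view employees.
func demoHandler() http.Handler {
	decide := func(userID, resource, action string) bool {
		return userID == "alice" && resource == "employees" && action == "view"
	}
	service := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "employee list")
	})
	return RequireAccess(decide, "employees", "view", service)
}

func main() {
	h := demoHandler()
	fmt.Println(statusFor(h, "alice"), statusFor(h, "bob"), statusFor(h, "")) // 200 403 400
}
```

The trade-off listed above is visible here: the check sits close to the handler and can use service-specific context, but every service has to import and wire the middleware itself.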

We chose the proxy-based implementation for this PoC for the following reasons:

- Enables independent access control for each service
- Clear separation of access control responsibilities
- No dependencies on other functionalities
- Flexible control according to each service's requirements
- Lightweight and simple implementation compared to an API gateway

Here's an example of proxy-based implementation:

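import (
    "log"
    "net/http"
)

// handleRequest is the PEP entry point: it identifies the caller, asks the
// PDP for a decision, and filters the upstream response before returning it.
// Proxy, extractResource, evaluateAccess, forwardRequest, and applyFiltering
// are defined elsewhere in the PoC and are not shown here.
func (p *Proxy) handleRequest(w http.ResponseWriter, r *http.Request) {
    // This PoC only supports the GET method, which maps to the "view" action
    if r.Method != http.MethodGet {
        http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
        return
    }

    // Get user ID
    userID := r.Header.Get("X-User-ID")
    if userID == "" {
        http.Error(w, "X-User-ID header is required", http.StatusBadRequest)
        return
    }

    // Identify resource and action
    resource := extractResource(r.URL.Path)
    action := "view"

    // Evaluate access with PDP
    allowed, filteredData, err := p.evaluateAccess(userID, resource, action)
    if err != nil {
        // Log the detail, but do not expose internal errors to the client
        log.Printf("policy evaluation failed: %v", err)
        http.Error(w, "Internal Server Error", http.StatusInternalServerError)
        return
    }
    if !allowed {
        http.Error(w, "Forbidden", http.StatusForbidden)
        return
    }

    // Proxy forwarding and response filtering
    response := p.forwardRequest(r)
    filteredResponse := p.applyFiltering(response, filteredData)
    if _, err := w.Write(filteredResponse); err != nil {
        log.Printf("failed to write response: %v", err)
    }
}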
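
The article does not show `evaluateAccess`, so here is a minimal sketch of how such a call could look when OPA runs as a separate service and is queried through its REST Data API (`POST /v1/data/<package path>` with an `input` document). The input keys follow the `input.user_id`, `input.resource`, and `input.action` references used in the policy later in this article. The function signature, the `decision` type, and the `newFakeOPA` stand-in server are assumptions made for illustration, not the PoC's actual code.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// decision mirrors the "result" document returned for the rbac package.
type decision struct {
	Allow         bool     `json:"allow"`
	AllowedFields []string `json:"allowed_fields"`
}

// evaluateAccess asks OPA for a decision via the Data API.
func evaluateAccess(opaURL, userID, resource, action string) (bool, []string, error) {
	body, err := json.Marshal(map[string]any{"input": map[string]string{
		"user_id": userID, "resource": resource, "action": action,
	}})
	if err != nil {
		return false, nil, err
	}
	resp, err := http.Post(opaURL+"/v1/data/rbac", "application/json", bytes.NewReader(body))
	if err != nil {
		return false, nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return false, nil, fmt.Errorf("OPA returned status %d", resp.StatusCode)
	}
	var out struct {
		Result decision `json:"result"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return false, nil, err
	}
	// A missing "result" decodes to the zero value, so the default is deny.
	return out.Result.Allow, out.Result.AllowedFields, nil
}

// newFakeOPA starts a stand-in server so the sketch runs without OPA.
// It allows only user "u1" and returns a fixed field list.
func newFakeOPA() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var in struct {
			Input map[string]string `json:"input"`
		}
		if err := json.NewDecoder(r.Body).Decode(&in); err != nil {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		d := decision{}
		if in.Input["user_id"] == "u1" {
			d = decision{Allow: true, AllowedFields: []string{"id", "name"}}
		}
		json.NewEncoder(w).Encode(map[string]decision{"result": d})
	}))
}

func main() {
	srv := newFakeOPA()
	defer srv.Close()
	allowed, fields, err := evaluateAccess(srv.URL, "u1", "employees", "view")
	fmt.Println(allowed, fields, err) // true [id name] <nil>
}
```

Treating an absent result as a denial keeps the PEP fail-closed when the policy path is wrong or the policy is not loaded.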

3.2 Access Control Models and Policy Implementation

3.2.1 Supported Access Control Models

OPA is a flexible policy engine that can implement various access control models:

RBAC (Role-Based Access Control)
- Role-based access control
- Implementation model in this PoC
- Assigns roles to users
- Grants permissions to roles
ABAC (Attribute-Based Access Control)
- Attribute-based access control
- User attributes (department, position, etc.)
- Resource attributes (confidentiality level, owner, etc.)
- Environmental attributes (time, location, etc.)
ReBAC (Relationship-Based Access Control)
- Relationship-based access control
- Social graph-like relationships
- Organizational hierarchy-based control
Other Models
- MAC (Mandatory Access Control)
- DAC (Discretionary Access Control)
- Combinations of these possible
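
To make the difference between the first two models concrete, the following sketch expresses one RBAC check and one ABAC check as plain Go functions. It is not how the PoC implements them (the PoC uses Rego and RBAC only); the types, the department rule, and the business-hours rule are hypothetical examples chosen for illustration.

```go
package main

import "fmt"

// Permission is a (resource, action) pair granted to a role.
type Permission struct{ Resource, Action string }

// rbacAllow grants access when any of the user's roles carries the
// requested permission. Only role membership matters.
func rbacAllow(roles []string, rolePerms map[string][]Permission, resource, action string) bool {
	want := Permission{resource, action}
	for _, role := range roles {
		for _, p := range rolePerms[role] {
			if p == want {
				return true
			}
		}
	}
	return false
}

// Subject and Record carry the attributes an ABAC rule inspects.
type Subject struct{ Department string }
type Record struct {
	OwnerDepartment string
	Confidential    bool
}

// abacAllow decides from attributes instead of roles: the departments must
// match, and confidential records are readable only from 09:00 to 17:59.
func abacAllow(s Subject, r Record, hour int) bool {
	if s.Department != r.OwnerDepartment {
		return false
	}
	if r.Confidential {
		return hour >= 9 && hour < 18
	}
	return true
}

func main() {
	perms := map[string][]Permission{"viewer": {{"employees", "view"}}}
	fmt.Println(rbacAllow([]string{"viewer"}, perms, "employees", "view")) // true
	fmt.Println(rbacAllow([]string{"viewer"}, perms, "employees", "edit")) // false

	hr := Subject{Department: "hr"}
	doc := Record{OwnerDepartment: "hr", Confidential: true}
	fmt.Println(abacAllow(hr, doc, 10), abacAllow(hr, doc, 22)) // true false
}
```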

3.2.2 Policy Definition Approaches

There are two main approaches to policy definition:

Post-filtering Approach

   # Example implementation: filtering after data retrieval
   # (Rego v1 syntax; on OPA versions before 1.0, add "import rego.v1")
   allowed_fields contains field if {
       roles := user_roles[input.user_id]
       some role in roles
       field_permissions := role_field_permissions[role]
       field := field_permissions[_]
   }

Pre-filtering Approach

   # Query generation example: filtering before data retrieval
   # (Rego v1 syntax; get_allowed_fields and build_conditions are helper
   # functions that are not shown here)
   generate_sql_query := query if {
       roles := user_roles[input.user_id]
       allowed_fields := get_allowed_fields(roles)
       query := sprintf("SELECT %s FROM employees WHERE %s",
           [concat(", ", allowed_fields), build_conditions(roles)])
   }

Trade-off Considerations

Post-filtering (Implemented in this PoC)
- Advantages:
  - Simple implementation
  - Easy database query optimization
  - Easy cache utilization
- Disadvantages:
  - Retrieval of unnecessary data
  - Increased memory usage
  - Network bandwidth waste
Pre-filtering
- Advantages:
  - Resource efficiency optimization
  - Minimal data retrieval
  - Improved scalability
- Disadvantages:
  - Complex query generation logic
  - Difficult database optimization
  - Complex caching strategy

Selection Guidelines:

- Use pre-filtering for large data volumes
- Use post-filtering for simple requirements
- Choose based on performance requirements
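
Pre-filtering is not implemented in this PoC, but the query-building step could be sketched on the Go side as follows. The sketch assumes the PDP has already returned the list of allowed fields; `buildSelect` and the `knownColumns` allow-list are hypothetical names, and the column names are invented for the example. Checking every field against a fixed allow-list matters because column names cannot be passed as bind parameters, so they must never be taken from policy data or user input unchecked.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// knownColumns lists the column names that may ever appear in a query.
// Hypothetical: these would match the real employees table.
var knownColumns = map[string]bool{"id": true, "name": true, "email": true, "salary": true}

// buildSelect returns a SELECT statement restricted to the allowed fields,
// so that fields the caller may not see are never read from the database.
func buildSelect(allowedFields []string) (string, error) {
	cols := make([]string, 0, len(allowedFields))
	for _, f := range allowedFields {
		if !knownColumns[f] {
			return "", fmt.Errorf("unknown column %q", f)
		}
		cols = append(cols, f)
	}
	if len(cols) == 0 {
		return "", fmt.Errorf("no readable columns")
	}
	sort.Strings(cols) // stable output makes the query easier to cache and test
	return "SELECT " + strings.Join(cols, ", ") + " FROM employees", nil
}

func main() {
	q, err := buildSelect([]string{"name", "id"})
	fmt.Println(q, err) // SELECT id, name FROM employees <nil>
}
```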

3.2.3 Implementation Example

This PoC implements the RBAC model with the following policy:

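package rbac

# Enables the "if", "contains", and "in" keywords on OPA versions before 1.0
import rego.v1

# Assumption: role and permission data is loaded into OPA as data documents
# under these names; adjust the paths to match the actual data layout.
import data.role_field_permissions
import data.role_permissions
import data.user_roles

# Deny by default
default allow := false

# Access permission rules
allow if {
    # Get user's roles
    roles := user_roles[input.user_id]

    # Check resource and action permissions
    some role in roles
    permissions := role_permissions[role]
    some permission in permissions
    permission.resource == input.resource
    permission.action == input.action
}

# Field-level filtering
allowed_fields contains field if {
    roles := user_roles[input.user_id]
    some role in roles
    field_permissions := role_field_permissions[role]
    field := field_permissions[_]
}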

This policy does the following:

- Denies all access by default
- Checks permissions based on user roles
- Returns only allowed fields
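
On the PEP side, the set produced by `allowed_fields` still has to be applied to the upstream response. The PoC's `applyFiltering` is not shown in this article, so the following is only a sketch of one possible approach, assuming the upstream service returns a JSON array of flat objects. `filterFields` is a hypothetical name and the sample records are made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// filterFields removes every key that is not in allowed from each record.
// Values are kept as raw JSON, so they pass through without being reinterpreted.
func filterFields(body []byte, allowed []string) ([]byte, error) {
	var records []map[string]json.RawMessage
	if err := json.Unmarshal(body, &records); err != nil {
		return nil, err
	}
	allow := make(map[string]bool, len(allowed))
	for _, f := range allowed {
		allow[f] = true
	}
	for _, rec := range records {
		for key := range rec {
			if !allow[key] {
				delete(rec, key)
			}
		}
	}
	return json.Marshal(records)
}

func main() {
	body := []byte(`[{"id":1,"name":"Ann","salary":5000},{"id":2,"name":"Bob","salary":4000}]`)
	out, err := filterFields(body, []string{"id", "name"})
	fmt.Println(string(out), err) // [{"id":1,"name":"Ann"},{"id":2,"name":"Bob"}] <nil>
}
```

Because the filter works from an allow-list, a field that is newly added to the upstream response stays hidden until a policy explicitly permits it.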

3.3 Data Model Design

The following schema is implemented using PostgreSQL:

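-- PRP Database
CREATE TABLE roles (
    id UUID PRIMARY KEY,
    name VARCHAR(255) NOT NULL
);

CREATE TABLE users (
    id UUID PRIMARY KEY,
    name VARCHAR(255) NOT NULL
);

CREATE TABLE user_roles (
    user_id UUID REFERENCES users(id),
    role_id UUID REFERENCES roles(id),
    PRIMARY KEY (user_id, role_id)
);

-- resources and actions are referenced by role_permissions but are omitted
-- from this excerpt; the minimal definitions below are assumptions.
CREATE TABLE resources (
    id UUID PRIMARY KEY,
    name VARCHAR(255) NOT NULL
);

CREATE TABLE actions (
    id UUID PRIMARY KEY,
    name VARCHAR(255) NOT NULL
);

CREATE TABLE role_permissions (
    role_id UUID REFERENCES roles(id),
    resource_id UUID REFERENCES resources(id),
    action_id UUID REFERENCES actions(id),
    PRIMARY KEY (role_id, resource_id, action_id)
);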

This schema enables:

- Flexible user-role associations
- Role-based permission management
- Clear definition of resources and actions
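
The policy looks up `user_roles` by user ID and `role_permissions` by role, so the relational rows have to be reshaped into nested maps before the PDP can use them. The sketch below shows one way this could be done, assuming the rows have already been joined so that roles, resources, and actions appear by name. The row types, `policyData`, and `buildPolicyData` are hypothetical and are not taken from the PoC.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Row types represent already-joined query results (names instead of UUIDs).
type userRoleRow struct{ UserID, Role string }
type rolePermissionRow struct{ Role, Resource, Action string }

// permission matches the permission.resource / permission.action
// fields that the policy compares against the input.
type permission struct {
	Resource string `json:"resource"`
	Action   string `json:"action"`
}

// policyData is the document shape the policy reads.
type policyData struct {
	UserRoles       map[string][]string     `json:"user_roles"`
	RolePermissions map[string][]permission `json:"role_permissions"`
}

// buildPolicyData groups flat rows by user ID and by role.
func buildPolicyData(urs []userRoleRow, rps []rolePermissionRow) policyData {
	d := policyData{
		UserRoles:       map[string][]string{},
		RolePermissions: map[string][]permission{},
	}
	for _, r := range urs {
		d.UserRoles[r.UserID] = append(d.UserRoles[r.UserID], r.Role)
	}
	for _, r := range rps {
		d.RolePermissions[r.Role] = append(d.RolePermissions[r.Role], permission{r.Resource, r.Action})
	}
	return d
}

func main() {
	d := buildPolicyData(
		[]userRoleRow{{"u1", "admin"}, {"u1", "viewer"}},
		[]rolePermissionRow{{"viewer", "employees", "view"}},
	)
	out, _ := json.Marshal(d)
	fmt.Println(string(out))
}
```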

4. Insights from Implementation

4.1 Advantages of OPA

Separation of Policies and Applications
- Policy changes don't affect application code
- Independent version control and deployment of policies
- Consistent policy application across services
Declarative Policy Description
- Intuitive policy implementation with Rego
- High policy logic readability
- Easy unit testing
High Flexibility
- Field-level fine-grained control
- Dynamic context-based decisions
- Complex rule implementation

4.2 Implementation Challenges

Learning Curve
- Need to learn Rego language
- Debugging can be difficult
- Policy design complexity
Performance Impact
- Slight overhead from proxy
- Additional latency from policy evaluation
- Need for caching strategy
Operational Complexity
- Multiple service management
- Policy distribution and updates
- Monitoring and troubleshooting

4.3 Design Considerations

To address implementation challenges, consider the following approaches:

Performance Optimization
- Policy evaluation result caching
- Minimal context retrieval
- Efficient database queries
Error Handling
- Clear error messages
- Fallback strategies
- Detailed logging
Testability
- Policy unit testing
- Automated integration testing
- Test environment preparation
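
As an illustration of the "policy evaluation result caching" item above, the following sketch caches decisions per (user, resource, action) for a fixed time-to-live. The PoC does not implement caching, so `DecisionCache`, `cacheKey`, `demoTTL`, and `countEvaluations` are hypothetical names, and the one-minute TTL is an arbitrary value for the example.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// cacheKey identifies one decision.
type cacheKey struct{ UserID, Resource, Action string }

type cacheEntry struct {
	allowed bool
	expires time.Time
}

// DecisionCache stores PDP decisions for a limited time.
type DecisionCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	entries map[cacheKey]cacheEntry
}

func NewDecisionCache(ttl time.Duration) *DecisionCache {
	return &DecisionCache{ttl: ttl, entries: map[cacheKey]cacheEntry{}}
}

// Get returns a cached decision if it has not expired; otherwise it calls
// evaluate (the real PDP query) and stores the result. Errors are not cached.
func (c *DecisionCache) Get(k cacheKey, evaluate func() (bool, error)) (bool, error) {
	c.mu.Lock()
	e, ok := c.entries[k]
	c.mu.Unlock()
	if ok && time.Now().Before(e.expires) {
		return e.allowed, nil
	}
	allowed, err := evaluate()
	if err != nil {
		return false, err
	}
	c.mu.Lock()
	c.entries[k] = cacheEntry{allowed: allowed, expires: time.Now().Add(c.ttl)}
	c.mu.Unlock()
	return allowed, nil
}

// demoTTL is an arbitrary TTL used only by this example.
const demoTTL = time.Minute

// countEvaluations repeats the same lookup n times and reports how often
// the PDP stand-in was actually called.
func countEvaluations(ttl time.Duration, n int) int {
	cache := NewDecisionCache(ttl)
	calls := 0
	for i := 0; i < n; i++ {
		cache.Get(cacheKey{"u1", "employees", "view"}, func() (bool, error) {
			calls++
			return true, nil
		})
	}
	return calls
}

func main() {
	fmt.Println(countEvaluations(demoTTL, 3)) // 1: later lookups hit the cache
	fmt.Println(countEvaluations(0, 3))       // 3: a zero TTL disables caching
}
```

The TTL bounds how long a revoked permission can remain in effect, so it is a trade-off between latency and how quickly policy or role changes take hold.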

5. Results and Future Prospects

Through this PoC, we confirmed that an access control system using OPA is effective in the following aspects:

Access Control Consistency
- Unified policy application across services
- Highly maintainable implementation
- Flexible permission management
Development Efficiency
- Separation from business logic
- Policy reusability
- Policy readability
- Easy testing
Operational Benefits
- Limited scope of policy changes
- Independent policy deployment
- Policy isolation

However, we also identified challenges that need consideration, such as learning costs and infrastructure complexity. These challenges require appropriate education, tool preparation, and a phased introduction approach.

Personal Thoughts

I believe that the most important logic in a permission management system is the description and implementation of policies, and implementing this with OPA can enhance flexibility and maintainability.

In particular, the ability to separate from business logic is a significant advantage, as changes to access control don't affect application code.

When policies and application code are tightly coupled, communication costs increase between teams developing the permission system and teams developing systems that use it. This becomes particularly important when permission management requirements need to adapt flexibly as the product grows.

Additionally, the ease of testing policies is crucial for maintaining policy quality. Automating policy testing can reduce risks associated with policy changes.

While we haven't implemented performance optimizations in this PoC, adopting OPA in large-scale systems would require performance-conscious implementation and policy design, such as caching policy evaluation results and optimizing database queries.

Regarding migration from existing systems that don't use policy engines like OPA, I found that extracting and separating policies, along with phased introduction, is both important and challenging.